   1 %%
   2 %%
   3
   4 \section*{Catalog Maintenance}
   5 \label{_ChapterStart12}
   6 \index[general]{Maintenance!Catalog }
   7 \index[general]{Catalog Maintenance }
   8 \addcontentsline{toc}{section}{Catalog Maintenance}
   9
Without proper setup and maintenance, your Catalog may continue to grow
indefinitely as you run Jobs and back up Files, and/or it may become
very inefficient and slow. How fast the size of your
Catalog grows depends on the number of Jobs you run and how many files they
back up. By deleting records within the database, you can make space available
for the new records that will be added during the next Job. By constantly
deleting old expired records (dates older than the Retention period), your
database size will remain roughly constant.
  18
  19 If you started with the default configuration files, they already contain
  20 reasonable defaults for a small number of machines (less than 5), so if you
  21 fall into that case, catalog maintenance will not be urgent if you have a few
  22 hundred megabytes of disk space free. Whatever the case may be, some knowledge
  23 of retention periods will be useful.
  24 \label{Retention}
  25
  26 \subsection*{Setting Retention Periods}
  27 \index[general]{Setting Retention Periods }
  28 \index[general]{Periods!Setting Retention }
  29 \addcontentsline{toc}{subsection}{Setting Retention Periods}
  30
  31 {\bf Bacula} uses three Retention periods: the {\bf File Retention} period,
  32 the {\bf Job Retention} period, and the {\bf Volume Retention} period. Of
  33 these three, the File Retention period is by far the most important in
  34 determining how large your database will become.
  35
  36 The {\bf File Retention} and the {\bf Job Retention} are specified in each
  37 Client resource as is shown below. The {\bf Volume Retention} period is
  38 specified in the Pool resource, and the details are given in the next chapter
  39 of this manual.
  40
  41 \begin{description}
  42
  43 \item [File Retention = \lt{}time-period-specification\gt{}]
  44    \index[dir]{File Retention  }
  45    The  File Retention record defines the length of time that  Bacula will keep
  46 File records in the Catalog database.  When this time period expires, and if
  47 {\bf AutoPrune} is set to {\bf yes}, Bacula will prune (remove) File records
  48 that  are older than the specified File Retention period. The pruning  will
  49 occur at the end of a backup Job for the given Client.  Note that the Client
  50 database record contains a copy of the  File and Job retention periods, but
  51 Bacula uses the  current values found in the Director's Client resource to  do
  52 the pruning.
  53
  54 Since File records in the database account for probably 80 percent of the
  55 size of the database, you should carefully determine exactly what File
  56 Retention period you need. Once the File records have been removed from
  57 the database, you will no longer be able to restore individual files
  58 in a Job. However, with Bacula version 1.37 and later, as long as the
  59 Job record still exists, you will be able to restore all files in the
  60 job.
  61
  62 Retention periods are specified in seconds, but as a convenience, there are
  63 a number of modifiers that permit easy specification in terms of minutes,
  64 hours, days, weeks, months, quarters, or years on the record.  See the
  65 \ilink{ Configuration chapter}{Time} of this manual for additional details
  66 of modifier specification.
  67
  68 The default File retention period is 60 days.
  69
  70 \item [Job Retention = \lt{}time-period-specification\gt{}]
  71    \index[dir]{Job Retention  }
  72    The Job Retention record defines the length of time that {\bf Bacula}
  73 will keep Job records in the Catalog database.  When this time period
  74 expires, and if {\bf AutoPrune} is set to {\bf yes} Bacula will prune
  75 (remove) Job records that are older than the specified Job Retention
  76 period.  Note, if a Job record is selected for pruning, all associated File
  77 and JobMedia records will also be pruned regardless of the File Retention
  78 period set.  As a consequence, you normally will set the File retention
  79 period to be less than the Job retention period.
  80
As mentioned above, once the File records are removed from the database,
you will no longer be able to restore individual files from the Job.
However, as long as the Job record remains in the database, you will be
able to restore all the files backed up for the Job (on version 1.37 and
later). As a consequence, it is generally a good idea to retain the Job
records much longer than the File records.
  87
  88 The retention period is specified in seconds, but as a convenience, there
  89 are a number of modifiers that permit easy specification in terms of
  90 minutes, hours, days, weeks, months, quarters, or years.  See the \ilink{
  91 Configuration chapter}{Time} of this manual for additional details of
  92 modifier specification.
  93
  94 The default Job Retention period is 180 days.
  95
  96 \item [AutoPrune = \lt{}yes/no\gt{}]
  97    \index[dir]{AutoPrune  }
  98    If AutoPrune is set to  {\bf yes} (default), Bacula will  automatically apply
  99 the File retention period and the Job  retention period for the Client at the
 100 end of the Job.
 101
 102 If you turn this off by setting it to {\bf no}, your  Catalog will grow each
 103 time you run a Job.
 104 \end{description}
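
The relationship between the two Client retention periods can be checked with
a little arithmetic. The following Python sketch is only an illustration: the
helper names are hypothetical, and the lengths assumed for months, quarters,
and years (30, 91, and 365 days) should be confirmed against the Configuration
chapter. It converts the default periods quoted above to seconds and confirms
that File records expire before the Job record that owns them.

```python
# Seconds per modifier. The 30-, 91-, and 365-day lengths used for months,
# quarters, and years are assumptions; confirm them in the Configuration chapter.
SECONDS_PER_UNIT = {
    "seconds": 1,
    "minutes": 60,
    "hours": 60 * 60,
    "days": 24 * 60 * 60,
    "weeks": 7 * 24 * 60 * 60,
    "months": 30 * 24 * 60 * 60,
    "quarters": 91 * 24 * 60 * 60,
    "years": 365 * 24 * 60 * 60,
}


def retention_seconds(count, unit):
    """Convert a retention period such as (60, "days") to seconds."""
    return count * SECONDS_PER_UNIT[unit]


def file_pruned_before_job(file_retention, job_retention):
    """True when File records expire before the Job record that owns them."""
    return retention_seconds(*file_retention) < retention_seconds(*job_retention)


if __name__ == "__main__":
    # Defaults quoted above: File Retention 60 days, Job Retention 180 days.
    print(retention_seconds(60, "days"))    # 5184000
    print(retention_seconds(180, "days"))   # 15552000
    print(file_pruned_before_job((60, "days"), (180, "days")))  # True
```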
 105
 106 \label{CompactingMySQL}
 107 \subsection*{Compacting Your MySQL Database}
 108 \index[general]{Database!Compacting Your MySQL }
 109 \index[general]{Compacting Your MySQL Database }
 110 \addcontentsline{toc}{subsection}{Compacting Your MySQL Database}
 111
Over time, as noted above, your database will tend to grow. I've noticed that
even though Bacula regularly prunes files, {\bf MySQL} does not effectively
reuse the freed space, and the database instead continues growing. To avoid
this, from time to time, you must compact your database. Normally, large
commercial databases such as Oracle have commands that will compact a database
to reclaim wasted file space. MySQL has the {\bf OPTIMIZE TABLE} command that
you can use, and SQLite version 2.8.4 and greater has the {\bf VACUUM} command.
We leave it to you to explore the utility of the {\bf OPTIMIZE TABLE} command
in MySQL.
 120
 121 All database programs have some means of writing the database out in ASCII
 122 format and then reloading it. Doing so will re-create the database from
 123 scratch producing a compacted result, so below, we show you how you can do
 124 this for MySQL, PostgreSQL and SQLite.
 125
 126 For a {\bf MySQL} database, you could write the Bacula database as an ASCII
 127 file (bacula.sql) then reload it by doing the following:
 128
 129 \footnotesize
 130 \begin{verbatim}
 131 mysqldump -f --opt bacula > bacula.sql
 132 mysql bacula < bacula.sql
 133 rm -f bacula.sql
 134 \end{verbatim}
 135 \normalsize
 136
 137 Depending on the size of your database, this will take more or less time and a
 138 fair amount of disk space. For example, if I cd to the location of the MySQL
 139 Bacula database (typically /opt/mysql/var or something similar) and enter:
 140
 141 \footnotesize
 142 \begin{verbatim}
 143 du bacula
 144 \end{verbatim}
 145 \normalsize
 146
I get {\bf 620,644} which means there are that many blocks containing 1024
bytes each or approximately 635 MB of data. After doing the {\bf mysqldump}, I
had a bacula.sql file that had {\bf 174,356} blocks, and after doing the {\bf
mysql} command to recreate the database, I ended up with a total of {\bf
210,464} blocks rather than the original {\bf 620,644}. In other words, the
compacted version of the database took approximately one third of the space
of the database that had been in use for about a year.

As a consequence, I suggest you monitor the size of your database and from
time to time (once every 6 months or year), compact it.
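
One way to do this monitoring is sketched below in Python. It assumes the
database is stored as ordinary files under one directory, as in the {\bf du}
example above; the function names are hypothetical. Note that it sums apparent
file sizes, whereas {\bf du} counts allocated disk blocks, so the two figures
can differ slightly.

```python
import os


def directory_size_bytes(path):
    """Sum the apparent sizes of all regular files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def growth_ratio(current_size, baseline_size):
    """How many times larger the database is than right after compaction."""
    return current_size / baseline_size


if __name__ == "__main__":
    # Block counts from the example above: before and after the reload.
    print(round(growth_ratio(620_644, 210_464), 2))  # 2.95
```

Recording the size right after a reload gives a baseline; when the ratio
approaches the factor of three seen above, it is probably time to compact
again.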
 157
 158 \label{DatabaseRepair}
 159 \label{RepairingMySQL}
 160 \subsection*{Repairing Your MySQL Database}
 161 \index[general]{Database!Repairing Your MySQL }
 162 \index[general]{Repairing Your MySQL Database }
 163 \addcontentsline{toc}{subsection}{Repairing Your MySQL Database}
 164
 165 If you find that you are getting errors writing to your MySQL database, or
 166 Bacula hangs each time it tries to access the database, you should consider
 167 running MySQL's database check and repair routines. The program you need to
 168 run depends on the type of database indexing you are using. If you are using
 169 the default, you will probably want to use {\bf myisamchk}. For more details
 170 on how to do this, please consult the MySQL document at:
 171 \elink{
 172 http://www.mysql.com/doc/en/Repair.html}
 173 {http://www.mysql.com/doc/en/Repair.html}.
 174
 175 If the errors you are getting are simply SQL warnings, then you might try
 176 running dbcheck before (or possibly after) using the MySQL database repair
 177 program. It can clean up many of the orphaned record problems, and certain
 178 other inconsistencies in the Bacula database.
 179
 180 If you are running into the error {\bf The table 'File' is full ...},
 181 it is probably because on version 4.x MySQL, the table is limited by
 182 default to a maximum size of 4 GB and you have probably run into
 183 the limit. The solution can be found at:
 184 \elink{http://dev.mysql.com/doc/refman/5.0/en/full-table.html}
 185 {http://dev.mysql.com/doc/refman/5.0/en/full-table.html}
 186
 187 You can display the maximum length of your table with:
 188
 189 \footnotesize
 190 \begin{verbatim}
 191 mysql bacula
 192 SHOW TABLE STATUS FROM bacula like "File";
 193 \end{verbatim}
 194 \normalsize
 195
If the column labeled "Max\_data\_length" is around 4 GB, this is likely
to be the source of your problem, and you can modify it with:
 198
 199 \footnotesize
 200 \begin{verbatim}
 201 mysql bacula
 202 ALTER TABLE File MAX_ROWS=281474976710656;
 203 \end{verbatim}
 204 \normalsize
 205
Alternatively you can modify your /etc/my.cnf file before creating the
Bacula tables, and in the [mysqld] section set:
 208
 209 \footnotesize
 210 \begin{verbatim}
 211 set-variable = myisam_data_pointer_size=6
 212 \end{verbatim}
 213 \normalsize
 214
The above myisam data pointer size setting must be made before you create your
Bacula tables or it will have no effect.
 217
 218 The row and pointer size changes should already be the default on MySQL
 219 version 5.x, so making these changes should only be necessary on MySQL 4.x
 220 depending on the size of your catalog database.
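
The 4 GB limit and the MAX\_ROWS value used above both follow from the width of
the data pointer. The short Python sketch below shows the arithmetic, assuming
the pointer addresses individual bytes of the data file (which is my
understanding for tables with variable-length rows); the function name is
hypothetical.

```python
def addressable_bytes(pointer_size_bytes):
    """Number of distinct byte offsets a pointer of the given width can hold."""
    return 2 ** (8 * pointer_size_bytes)


if __name__ == "__main__":
    print(addressable_bytes(4))  # 4294967296, the 4 GB limit
    print(addressable_bytes(6))  # 281474976710656, the MAX_ROWS value above
```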
 221
 222
 223 \label{RepairingPSQL}
 224 \subsection*{Repairing Your PostgreSQL Database}
 225 \index[general]{Database!Repairing Your PostgreSQL }
 226 \index[general]{Repairing Your PostgreSQL Database }
 227 \addcontentsline{toc}{subsection}{Repairing Your PostgreSQL Database}
 228
The same considerations apply that are indicated above for MySQL. That is,
consult the PostgreSQL documents for how to repair the database, and also
consider using Bacula's dbcheck program if the conditions described above
apply.
 233
 234 \label{DatabasePerformance}
 235 \subsection*{Database Performance Issues}
 236 \index[general]{Database Performance Issues}
 237 \index[general]{Performance!Database}
 238 \addcontentsline{toc}{subsection}{Database Performance Issues}
 239
 240 There are a considerable number of ways each of the databases can be
 241 tuned to improve the performance. Going from an untuned database to one
 242 that is properly tuned can make a difference of a factor of 100 or more
 243 in the time to insert or search for records.
 244
 245 For each of the databases, you may get significant improvements by adding
 246 additional indexes. The comments in the Bacula make\_xxx\_tables give some
 247 indications as to what indexes may be appropriate.  Please see below
 248 for specific instructions on checking indexes.
 249
For MySQL, what seems to be very important is to examine the
my.cnf file. You may obtain significant performance improvements by switching
to the my-large.cnf or my-huge.cnf files that come with the MySQL source
code.
 254
 255 For SQLite3, one significant factor in improving the performance is
 256 to ensure that there is a "PRAGMA synchronous = NORMAL;" statement.
 257 This reduces the number of times that the database flushes the in memory
 258 cache to disk. There are other settings for this PRAGMA that can
 259 give even further performance improvements at the risk of a database
 260 corruption if your system crashes.
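
The effect of the statement can be observed with a few lines of Python and its
standard sqlite3 module. This is only a sketch: Bacula opens its own database
connection, the helper names are hypothetical, and the setting applies to the
connection that issues it rather than being stored in the database file.

```python
import sqlite3

# Names for the integer levels reported by "PRAGMA synchronous".
SYNC_LEVELS = {0: "OFF", 1: "NORMAL", 2: "FULL", 3: "EXTRA"}


def open_with_normal_sync(db_path):
    """Open the database and relax syncing to NORMAL for this connection."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA synchronous = NORMAL;")
    return conn


def current_sync_level(conn):
    """Return the name of the connection's current synchronous level."""
    (level,) = conn.execute("PRAGMA synchronous;").fetchone()
    return SYNC_LEVELS.get(level, str(level))
```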
 261
For PostgreSQL, you might want to consider turning fsync off.  Of course
doing so can cause corrupted databases in the event of a machine crash.
There are many different ways that you can tune PostgreSQL; the
following document discusses a few of them:
 266 \elink{
 267 http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html}
 268 {http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html}.
 269
 270 There is also a PostgreSQL FAQ question number 3.3 that may
 271 answer some of your questions about how to improve performance
 272 of the PostgreSQL engine:
 273 \elink{
 274 http://www.postgresql.org/docs/faqs.FAQ.html\#3.3}
 275 {http://www.postgresql.org/docs/faqs.FAQ.html\#3.3}.
 276
 277
 278 \subsection*{Performance Issues Indexes}
 279 \index[general]{Database Performance Issues Indexes}
 280 \index[general]{Performance!Database}
 281 \addcontentsline{toc}{subsection}{Database Performance Issues Indexes}
 282 One of the most important considerations for improving performance on
 283 the Bacula database is to ensure that it has all the appropriate indexes.
 284 Several users have reported finding that their database did not have
 285 all the indexes in the default configuration.  In addition, you may
 286 find that because of your own usage patterns, you need additional indexes.
 287
The most important indexes for performance are the three indexes on the
{\bf File} table.  The first index is on {\bf FileId} and is automatically
made because it is the unique key used to access the table.  The other
two are the JobId index and the (FilenameId, PathId) index.  If these indexes
are not present, your performance may suffer a lot.
 293
 294 \subsubsection*{PostgreSQL Indexes}
 295 On PostgreSQL, you can check to see if you have the proper indexes using
 296 the following commands:
 297
 298 \footnotesize
 299 \begin{verbatim}
 300 psql bacula
 301 select * from pg_indexes where tablename='file';
 302 \end{verbatim}
 303 \normalsize
 304
 305 If you do not see output that indicates that all three indexes
 306 are created, you can create the two additional indexes using:
 307
 308 \footnotesize
 309 \begin{verbatim}
 310 psql bacula
 311 CREATE INDEX file_jobid_idx on file (jobid);
 312 CREATE INDEX file_fp_idx on file (filenameid, pathid);
 313 \end{verbatim}
 314 \normalsize
 315
 316 \subsubsection*{MySQL Indexes}
 317 On MySQL, you can check if you have the proper indexes by:
 318
 319 \footnotesize
 320 \begin{verbatim}
 321 mysql bacula
 322 show index from File;
 323 \end{verbatim}
 324 \normalsize
 325
 326 If the indexes are not present, especially the JobId index, you can
 327 create them with the following commands:
 328
 329 \footnotesize
 330 \begin{verbatim}
 331 mysql bacula
 332 CREATE INDEX file_jobid_idx on File (JobId);
CREATE INDEX file_jpf_idx on File (JobId, FilenameId, PathId);
 334 \end{verbatim}
 335 \normalsize
 336
 337 Though normally not a problem, you should ensure that the indexes
 338 defined for Filename and Path are both set to 255 characters. Some users
 339 reported performance problems when their indexes were set to 50 characters.
 340 To check, do:
 341
 342 \footnotesize
 343 \begin{verbatim}
 344 mysql bacula
 345 show index from Filename;
 346 show index from Path;
 347 \end{verbatim}
 348 \normalsize
 349
and what is important is that for Filename, you have an index with
Key\_name "Name" and Sub\_part "255". For Path, you should have a Key\_name
"Path" and Sub\_part "255".  If one or the other does not exist or the
Sub\_part is less than 255, you can drop and recreate the appropriate
index with:
 355
 356 \footnotesize
 357 \begin{verbatim}
 358 mysql bacula
 359 DROP INDEX Path on Path;
CREATE INDEX Path on Path (Path(255));
 361
 362 DROP INDEX Name on Filename;
 363 CREATE INDEX Name on Filename (Name(255));
 364 \end{verbatim}
 365 \normalsize
 366
 367
 368 \subsubsection*{SQLite Indexes}
 369 On SQLite, you can check if you have the proper indexes by:
 370
 371 \footnotesize
 372 \begin{verbatim}
 373 sqlite <path>bacula.db
 374 select * from sqlite_master where type='index' and tbl_name='File';
 375 \end{verbatim}
 376 \normalsize
 377
 378 If the indexes are not present, especially the JobId index, you can
 379 create them with the following commands:
 380
 381 \footnotesize
 382 \begin{verbatim}
sqlite <path>bacula.db
CREATE INDEX file_jobid_idx on File (JobId);
CREATE INDEX file_jfp_idx on File (JobId, FilenameId, PathId);
 386 \end{verbatim}
 387 \normalsize
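
For an SQLite catalog, the check can also be scripted. The Python sketch below
uses the standard sqlite3 module and SQLite's index\_list and index\_info
pragmas to report which wanted indexes are absent. The function names and the
list of wanted column combinations are assumptions made for illustration;
adapt them to the indexes your Bacula version expects.

```python
import sqlite3


def indexed_columns(conn, table):
    """Return a set with one tuple of column names per index on `table`."""
    found = set()
    index_rows = conn.execute(f"PRAGMA index_list('{table}')").fetchall()
    for row in index_rows:
        index_name = row[1]  # second column of index_list is the index name
        info = conn.execute(f"PRAGMA index_info('{index_name}')").fetchall()
        found.add(tuple(col[2] for col in info))  # third column is the column name
    return found


def missing_indexes(conn, table, wanted):
    """List the wanted column tuples that no existing index covers exactly."""
    present = indexed_columns(conn, table)
    return [cols for cols in wanted if cols not in present]


if __name__ == "__main__":
    # Hypothetical path; use the bacula.db in your working directory.
    conn = sqlite3.connect("bacula.db")
    wanted = [("JobId",), ("JobId", "FilenameId", "PathId")]
    print(missing_indexes(conn, "File", wanted))
```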
 388
 389
 390
 391 \label{CompactingPostgres}
 392 \subsection*{Compacting Your PostgreSQL Database}
 393 \index[general]{Database!Compacting Your PostgreSQL }
 394 \index[general]{Compacting Your PostgreSQL Database }
 395 \addcontentsline{toc}{subsection}{Compacting Your PostgreSQL Database}
 396
Over time, as noted above, your database will tend to grow, even though
Bacula regularly prunes files. PostgreSQL has a {\bf VACUUM}
command that will compact your database for you. Alternatively you may want to
use the {\bf vacuumdb} command, which can be run from a cron job.
 401
 402 All database programs have some means of writing the database out in ASCII
 403 format and then reloading it. Doing so will re-create the database from
 404 scratch producing a compacted result, so below, we show you how you can do
 405 this for PostgreSQL.
 406
 407 For a {\bf PostgreSQL} database, you could write the Bacula database as an
 408 ASCII file (bacula.sql) then reload it by doing the following:
 409
 410 \footnotesize
 411 \begin{verbatim}
 412 pg_dump -c bacula > bacula.sql
 413 cat bacula.sql | psql bacula
 414 rm -f bacula.sql
 415 \end{verbatim}
 416 \normalsize
 417
Depending on the size of your database, this will take more or less time and a
fair amount of disk space. For example, you can {\bf cd} to the location of
the Bacula database (typically /usr/local/pgsql/data or possibly
/var/lib/pgsql/data) and check the size.
 422
There are certain PostgreSQL users who do not recommend the above
procedure. They have the following to say:
PostgreSQL does not
need to be dumped/restored to keep the database efficient.  A normal
process of vacuuming will prevent the database from ever getting too
large.  If you want to fine-tweak the database storage, commands such
as VACUUM FULL, REINDEX, and CLUSTER exist specifically to keep you
from having to do a dump/restore.
 431
 432 Finally, you might want to look at the PostgreSQL documentation on
 433 this subject at
 434 \elink{http://www.postgresql.org/docs/8.1/interactive/maintenance.html}
 435 {http://www.postgresql.org/docs/8.1/interactive/maintenance.html}.
 436
 437 \subsection*{Compacting Your SQLite Database}
 438 \index[general]{Compacting Your SQLite Database }
 439 \index[general]{Database!Compacting Your SQLite }
 440 \addcontentsline{toc}{subsection}{Compacting Your SQLite Database}
 441
First please read the previous section that explains why it is necessary to
compact a database. SQLite version 2.8.4 and greater has the {\bf VACUUM}
command for compacting the database.
 445
 446 \footnotesize
 447 \begin{verbatim}
cd working-directory
 449 echo 'vacuum;' | sqlite bacula.db
 450 \end{verbatim}
 451 \normalsize
 452
 453 As an alternative, you can use the following commands, adapted to your system:
 454
 455
 456 \footnotesize
 457 \begin{verbatim}
cd working-directory
 459 echo '.dump' | sqlite bacula.db > bacula.sql
 460 rm -f bacula.db
 461 sqlite bacula.db < bacula.sql
 462 rm -f bacula.sql
 463 \end{verbatim}
 464 \normalsize
 465
Where {\bf working-directory} is the directory that you specified in the
Director's configuration file. Note, in the case of SQLite, it is necessary to
completely delete (rm) the old database before creating a new compacted
version.
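
To see how much space a vacuum actually recovers, the operation can be wrapped
in a small script. The Python sketch below uses the standard sqlite3 module
(SQLite version 3) and a hypothetical function name. It should presumably be
run only while Bacula is not using the database.

```python
import os
import sqlite3


def vacuum_and_report(db_path):
    """Run VACUUM on the database and return (bytes_before, bytes_after)."""
    before = os.path.getsize(db_path)
    # isolation_level=None selects autocommit mode; VACUUM cannot run
    # inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM;")
    finally:
        conn.close()
    after = os.path.getsize(db_path)
    return before, after
```

Calling the function with the path to bacula.db in your working directory
returns the file size before and after compaction, which makes it easy to
judge whether the exercise was worthwhile.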
 470
 471 \subsection*{Migrating from SQLite to MySQL}
 472 \index[general]{MySQL!Migrating from SQLite to }
 473 \index[general]{Migrating from SQLite to MySQL }
 474 \addcontentsline{toc}{subsection}{Migrating from SQLite to MySQL}
 475
You may begin using Bacula with SQLite then later find that you want to switch
to MySQL for any of a number of reasons: SQLite tends to use more disk than
MySQL; when the database is corrupted it is often more catastrophic than
with MySQL or PostgreSQL.
Several users have succeeded in converting from SQLite to MySQL by
exporting the SQLite data and then processing it with Perl scripts
prior to putting it into MySQL. This is, however, not a simple
process.
 484
 485 \label{BackingUpBacula}
 486 \subsection*{Backing Up Your Bacula Database}
 487 \index[general]{Backing Up Your Bacula Database }
 488 \index[general]{Database!Backing Up Your Bacula }
 489 \addcontentsline{toc}{subsection}{Backing Up Your Bacula Database}
 490
If ever the machine on which your Bacula database resides crashes, and you need
to restore from backup tapes, one of your first priorities will probably be to
recover the database. Although Bacula will happily back up your catalog
database if it is specified in the FileSet, this is not a very good way to do
it, because the database will be saved while Bacula is modifying it. Thus the
database may be in an unstable state. Worse yet, you will back up the database
before all the Bacula updates have been applied.

To resolve these problems, you need to back up the database after all the backup
 500 jobs have been run. In addition, you will want to make a copy while Bacula is
 501 not modifying it. To do so, you can use two scripts provided in the release
 502 {\bf make\_catalog\_backup} and {\bf delete\_catalog\_backup}. These files
 503 will be automatically generated along with all the other Bacula scripts. The
 504 first script will make an ASCII copy of your Bacula database into {\bf
 505 bacula.sql} in the working directory you specified in your configuration, and
 506 the second will delete the {\bf bacula.sql} file.
 507
 508 The basic sequence of events to make this work correctly is as follows:
 509
 510 \begin{itemize}
 511 \item Run all your nightly backups
 512 \item After running your nightly backups, run a Catalog backup Job
 513 \item The Catalog backup job must be scheduled after your last nightly backup
 514
 515 \item You use {\bf RunBeforeJob} to create the ASCII  backup file and {\bf
 516    RunAfterJob} to clean up
 517 \end{itemize}
 518
 519 Assuming that you start all your nightly backup jobs at 1:05 am (and that they
 520 run one after another), you can do the catalog backup with the following
 521 additional Director configuration statements:
 522
 523 \footnotesize
 524 \begin{verbatim}
 525 # Backup the catalog database (after the nightly save)
 526 Job {
 527   Name = "BackupCatalog"
 528   Type = Backup
 529   Client=rufus-fd
 530   FileSet="Catalog"
 531   Schedule = "WeeklyCycleAfterBackup"
 532   Storage = DLTDrive
 533   Messages = Standard
 534   Pool = Default
 535   RunBeforeJob = "/home/kern/bacula/bin/make_catalog_backup"
 536   RunAfterJob  = "/home/kern/bacula/bin/delete_catalog_backup"
 537   Write Bootstrap = "/home/kern/bacula/working/BackupCatalog.bsr"
 538 }
 539 # This schedule does the catalog. It starts after the WeeklyCycle
 540 Schedule {
  Name = "WeeklyCycleAfterBackup"
 542   Run = Full sun-sat at 1:10
 543 }
 544 # This is the backup of the catalog
 545 FileSet {
 546   Name = "Catalog"
 547   Include {
 548     Options {
 549       signature=MD5
 550     }
    File = <working_directory>/bacula.sql
 552   }
 553 }
 554 \end{verbatim}
 555 \normalsize
 556
 557 Be sure to write a bootstrap file as in the above example. However, it is preferable
 558 to write or copy the bootstrap file to another computer. It will allow
 559 you to quickly recover the database backup should that be necessary.  If
 560 you do not have a bootstrap file, it is still possible to recover your
 561 database backup, but it will be more work and take longer.
 562
 563 \label{BackingUPOtherDBs}
 564 \subsection*{Backing Up Third Party Databases}
 565 \index[general]{Backing Up Third Party Databases }
 566 \index[general]{Databases!Backing Up Third Party }
 567 \addcontentsline{toc}{subsection}{Backing Up Third Party Databases}
 568
If you are running a database in production mode on your machine, Bacula will
happily back up the files, but if the database is in use while Bacula is
reading it, you may back it up in an unstable state.

The best solution is to shut down your database before backing it up, or use
some tool specific to your database to make a valid live copy perhaps by
dumping the database in ASCII format. I am not a database expert, so I cannot
provide you advice on how to do this, but if you are unsure about how to
back up your database, you might try visiting the Backup Central site, which
has been renamed Storage Mountain (www.backupcentral.com). In particular,
 579 their
 580 \elink{ Free Backup and Recovery
 581 Software}{http://www.backupcentral.com/toc-free-backup-software.html} page has
links to scripts that show you how to shut down and back up most major
databases.
 584 \label{Size}
 585
 586 \subsection*{Database Size}
 587 \index[general]{Size!Database }
 588 \index[general]{Database Size }
 589 \addcontentsline{toc}{subsection}{Database Size}
 590
As mentioned above, if you do not do automatic pruning, your Catalog will grow
each time you run a Job. Normally, you should decide how long you want File
records to be maintained in the Catalog and set the {\bf File Retention}
period to that time. Then you can either wait and see how big your Catalog
gets or make a calculation assuming approximately 154 bytes for each File
saved and knowing the number of Files that are saved during each backup and
the number of Clients you back up.
 598
For example, suppose you do a backup of two systems, each with 100,000 files.
Suppose further that you do a Full backup weekly and an Incremental every day,
and that the Incremental backup typically saves 10,000 files. The size of your
database after a month can roughly be calculated as:
 603
 604 \footnotesize
 605 \begin{verbatim}
 606    Size = 154 * No. Systems * (100,000 * 4 + 10,000 * 26)
 607 \end{verbatim}
 608 \normalsize
 609
 610 where we have assumed 4 weeks in a month and 26 incremental backups per month.
 611 This would give the following:
 612
 613 \footnotesize
 614 \begin{verbatim}
 615    Size = 154 * 2 * (100,000 * 4 + 10,000 * 26)
 616 or
 617    Size = 308 * (400,000 + 260,000)
 618 or
 619    Size = 203,280,000 bytes
 620 \end{verbatim}
 621 \normalsize
 622
 623 So for the above two systems, we should expect to have a database size of
 624 approximately 200 Megabytes. Of course, this will vary according to how many
 625 files are actually backed up.
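
The same estimate is easy to repeat for your own site with a few lines of
Python. This is only a sketch of the calculation above; the function and
parameter names are hypothetical, and 154 bytes per File record is the
approximation used in this section.

```python
def estimate_catalog_bytes(clients, files_per_full, fulls_per_month,
                           files_per_incremental, incrementals_per_month,
                           bytes_per_file=154):
    """Estimate one month of Catalog growth using the formula above."""
    files_per_client = (files_per_full * fulls_per_month
                        + files_per_incremental * incrementals_per_month)
    return bytes_per_file * clients * files_per_client


if __name__ == "__main__":
    # The figures used in the calculation above.
    print(estimate_catalog_bytes(2, 100_000, 4, 10_000, 26))  # 203280000
```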
 626
 627 Below are some statistics for a MySQL database containing Job records for five
 628 Clients beginning September 2001 through May 2002 (8.5 months) and File
 629 records for the last 80 days. (Older File records have been pruned). For these
 630 systems, only the user files and system files that change are backed up. The
 631 core part of the system is assumed to be easily reloaded from the RedHat rpms.
 632
 633
 634 In the list below, the files (corresponding to Bacula Tables) with the
 635 extension .MYD contain the data records whereas files with the extension .MYI
 636 contain indexes.
 637
You will note that the File records (containing the file attributes) make up
the large bulk of the number of records as well as the space used (about 459
million bytes including the indexes). As a consequence, the most important
Retention period will be the {\bf File Retention} period. A quick calculation
shows that for each File that is saved, the database grows by approximately
150 bytes.
 643
 644 \footnotesize
 645 \begin{verbatim}
 646       Size in
 647        Bytes   Records    File
 648  ============  =========  ===========
 649           168          5  Client.MYD
 650         3,072             Client.MYI
 651   344,394,684  3,080,191  File.MYD
 652   115,280,896             File.MYI
 653     2,590,316    106,902  Filename.MYD
 654     3,026,944             Filename.MYI
 655           184          4  FileSet.MYD
 656         2,048             FileSet.MYI
 657        49,062      1,326  JobMedia.MYD
 658        30,720             JobMedia.MYI
 659       141,752      1,378  Job.MYD
 660        13,312             Job.MYI
 661         1,004         11  Media.MYD
 662         3,072             Media.MYI
 663     1,299,512     22,233  Path.MYD
 664       581,632             Path.MYI
 665            36          1  Pool.MYD
 666         3,072             Pool.MYI
 667             5          1  Version.MYD
 668         1,024             Version.MYI
 669 \end{verbatim}
 670 \normalsize
 671
This database has a total size of approximately 467 million bytes (about 446
megabytes when a megabyte is counted as 1,048,576 bytes).
 673
If we were using SQLite, the determination of the total database size would be
much easier since it is a single file, but we would have less insight into the
size of the individual tables than we have in this case.
 677
Note, SQLite databases may be as much as 50\% larger than MySQL databases due
to the fact that all data is stored as ASCII strings. That is, even binary
integers are stored as ASCII strings, and this seems to increase the space
needed.