   1 <!doctype linuxdoc system>      <!-- -*- text-mode -*- -->
   2
   3 <article>
   4 <title>ld65 User's Guide
   5 <author><url url="mailto:uz@cc65.org" name="Ullrich von Bassewitz">
   6
   7 <abstract>
   8 The ld65 linker combines object files into an executable file. ld65 is highly
   9 configurable and uses configuration files for high flexibility.
  10 </abstract>
  11
  12 <!-- Table of contents -->
  13 <toc>
  14
  15 <!-- Begin the document -->
  16
  17 <sect>Overview<p>
  18
  19 The ld65 linker combines several object modules created by the ca65
  20 assembler, producing an executable file. The object modules may be read
  21 from a library created by the ar65 archiver (this is somewhat faster and
  22 more convenient). The linker was designed to be as flexible as possible.
  23 It complements the features that are built into the ca65 macroassembler:
  24
  25 <itemize>
  26
  27 <item>  Accept any number of segments to form an executable module.
  28
  29 <item>  Resolve arbitrary expressions stored in the object files.
  30
  31 <item>  In case of errors, use the meta information stored in the object files
  32         to produce helpful error messages. In case of undefined symbols,
  33         expression range errors, or symbol type mismatches, ld65 is able to
  34         tell you the exact location in the original assembler source, where
  35         the symbol was referenced.
  36
  37 <item>  Flexible output. The output of ld65 is highly configurable by a config
  38         file. Some more-common platforms are supported by default configurations
  39         that may be activated by naming the target system. The output
  40         generation was designed with different output formats in mind, so
  41         adding other formats shouldn't be a great problem.
  42
  43 </itemize>
  44
  45
  46 <sect>Usage<p>
  47
  48
  49 <sect1>Command-line option overview<p>
  50
  51 The linker is called as follows:
  52
  53 <tscreen><verb>
  54 ---------------------------------------------------------------------------
  55 Usage: ld65 [options] module ...
  56 Short options:
  57   -(                    Start a library group
  58   -)                    End a library group
  59   -C name               Use linker config file
  60   -D sym=val            Define a symbol
  61   -L path               Specify a library search path
  62   -Ln name              Create a VICE label file
  63   -S addr               Set the default start address
  64   -V                    Print the linker version
  65   -h                    Help (this text)
  66   -m name               Create a map file
  67   -o name               Name the default output file
  68   -t sys                Set the target system
  69   -u sym                Force an import of symbol 'sym'
  70   -v                    Verbose mode
  71   -vm                   Verbose map file
  72
  73 Long options:
  74   --allow-multiple-definition   Allow multiple definitions
  75   --cfg-path path               Specify a config file search path
  76   --config name                 Use linker config file
  77   --dbgfile name                Generate debug information
  78   --define sym=val              Define a symbol
  79   --end-group                   End a library group
  80   --force-import sym            Force an import of symbol 'sym'
  81   --help                        Help (this text)
  82   --lib file                    Link this library
  83   --lib-path path               Specify a library search path
  84   --mapfile name                Create a map file
  85   --module-id id                Specify a module id
  86   --obj file                    Link this object file
  87   --obj-path path               Specify an object file search path
  88   --start-addr addr             Set the default start address
  89   --start-group                 Start a library group
  90   --target sys                  Set the target system
  91   --version                     Print the linker version
  92 ---------------------------------------------------------------------------
  93 </verb></tscreen>
  94
  95
  96 <sect1>Command-line options in detail<p>
  97
  98 Here is a description of all of the command-line options:
  99
 100 <descrip>
 101
 102   <tag><tt>--allow-multiple-definition</tt></tag>
 103
 104   Normally when a global symbol is defined multiple times, ld65 will
 105   issue an error and not create the output file. This option lets it
 106   silently ignore this fact and continue. The first definition of a
 107   symbol will be used.
 108
 109
 110   <label id="option--start-group">
 111   <tag><tt>-(, --start-group</tt></tag>
 112
 113   Start a library group. The libraries specified within a group are searched
 114   multiple times to resolve crossreferences within the libraries. Normally,
 115   crossreferences are resolved only within a library, that is the library is
 116   searched multiple times. Libraries specified later on the command line
 117   cannot reference otherwise unreferenced symbols in libraries specified
 118   earlier, because the linker has already handled them. Library groups are
 119   a solution for this problem, because the linker will search repeatedly
 120   through all libraries specified in the group, until all possible open
 121   symbol references have been satisfied.
 122
 123
 124   <tag><tt>-), --end-group</tt></tag>
 125
 126   End a library group. See the explanation of the <tt><ref
 127   id="option--start-group" name="--start-group"></tt> option.
 128
 129
 130   <tag><tt>-h, --help</tt></tag>
 131
 132   Print the short option summary shown above.
 133
 134
 135   <label id="option-m">
 136   <tag><tt>-m name, --mapfile name</tt></tag>
 137
 138   This option (which needs an argument that will be used as a filename for
 139   the generated map file) will cause the linker to generate a map file.
 140   The map file contains a detailed overview of the modules used, the
 141   sizes of the different segments, and a table of exported
 142   symbols.
 143
 144
 145   <label id="option-o">
 146   <tag><tt>-o name</tt></tag>
 147
 148   The -o switch is used to give the name of the default output file.
 149   Depending on your output configuration, this name <em/might not/ be used as the
 150   name for the output file. However, for the default configurations, this
 151   name is used for the output file name.
 152
 153
 154   <label id="option-t">
 155   <tag><tt>-t sys, --target sys</tt></tag>
 156
 157   The argument for the -t switch is the name of the target system. Since this
 158   switch will activate a default configuration, it may not be used together
 159   with the <tt><ref id="option-C" name="-C"></tt> option. The following target
 160   systems are currently supported:
 161
 162   <itemize>
 163   <item>none
 164   <item>module
 165   <item>apple2
 166   <item>apple2enh
 167   <item>atari2600
 168   <item>atari
 169   <item>atarixl
 170   <item>atmos
 171   <item>c16 (works also for the c116 with memory up to 32K)
 172   <item>c64
 173   <item>c128
 174   <item>cbm510 (CBM-II series with 40-column video)
 175   <item>cbm610 (all CBM series-II computers with 80-column video)
 176   <item>geos-apple
 177   <item>geos-cbm
 178   <item>lunix
 179   <item>lynx
 180   <item>nes
 181   <item>pet (all CBM PET systems except the 2001)
 182   <item>plus4
 183   <item>sim6502
 184   <item>sim65c02
 185   <item>supervision
 186   <item>telestrat
 187   <item>vic20
 188   </itemize>
 189
 190   There are a few more targets defined, but none of them is actually
 191   supported.
 192
 193
 194   <tag><tt>-u sym[:addrsize], --force-import sym[:addrsize]</tt></tag>
 195
 196   Force an import of a symbol. While object files are always linked to the
 197   output file, regardless of whether there are any references, object modules
 198   from libraries are linked in only if an import can be satisfied by the module.
 199   The <tt/--force-import/ option may be used to add a reference to a symbol and
 200   as a result force linkage of the module that exports the identifier.
 201
 202   The name of the symbol may optionally be followed by a colon and an address-size
 203   specifier. If no address size is specified, the default address size
 204   for the target machine is used.
 205
 206   Please note that the symbol name needs to have the internal representation,
 207   meaning you have to prepend an underscore for C identifiers.
 208
 209
 210   <label id="option-v">
 211   <tag><tt>-v, --verbose</tt></tag>
 212
 213   Using the -v option, you may enable more output that may help you to
 214   locate problems. If an undefined symbol is encountered, -v causes the
 215   linker to print a detailed list of the references (that is, source file
 216   and line) for this symbol.
 217
 218
 219   <tag><tt>-vm</tt></tag>
 220
 221   Must be used in conjunction with <tt><ref id="option-m" name="-m"></tt>
 222   (generate map file). Normally the map file will not include empty segments
 223   and sections, or unreferenced symbols. Using this option, you can force the
 224   linker to include all that information into the map file.  Also, it will
 225   include a second <tt/Exports/ list.  The first list is sorted by name;
 226   the second one is sorted by value.
 227
 228
 229   <label id="option-C">
 230   <tag><tt>-C name, --config name</tt></tag>
 231
 232   This gives the name of an output config file to use. See <ref id="config-files"
 233   name="Configuration files"> for more information. -C may not be used together
 234   with <tt><ref id="option-t" name="-t"></tt>.
 235
 236
 237   <label id="option-D">
 238   <tag><tt>-D sym=value, --define sym=value</tt></tag>
 239
 240   This option lets you define an external symbol on the command line. The value
 241   may start with a '&dollar;' sign or with <tt/0x/ for hexadecimal values;
 242   otherwise, a leading zero denotes an octal value. See also <ref
 243   id="SYMBOLS" name="the SYMBOLS section"> in the configuration file.
 244
 245
 246   <label id="option--lib-path">
 247   <tag><tt>-L path, --lib-path path</tt></tag>
 248
 249   Specify a library search path. This option may be used more than once. It
 250   adds a directory to the search path for library files. Libraries specified
 251   without a path are searched in the current directory, in the list of
 252   directories specified using <tt/--lib-path/, in directories given by
 253   environment variables, and in a built-in default directory.
 254
 255
 256   <tag><tt>-Ln name</tt></tag>
 257
 258   This option allows you to create a file that contains all global labels and
 259   may be loaded into the VICE emulator using the <tt/ll/ (load label) command
 260   or into the Oricutron emulator using the <tt/sl/ (symbols load) command. You
 261   may use this to debug your code with VICE. Note: Older versions had some
 262   bugs in the label code. If you have problems, please get the latest <url
 263   url="http://vice-emu.sourceforge.net" name="VICE"> version.
 264
 265
 266   <label id="option-S">
 267   <tag><tt>-S addr, --start-addr addr</tt></tag>
 268
 269   Using -S you may define the default starting address. Whether and how this
 270   address is used depends on the config file in use. Among the default
 271   configurations, only the "none", "apple2" and "apple2enh" systems honor an
 272   explicit start address; all other default configs provide their own.
 273
 274
 275   <tag><tt>-V, --version</tt></tag>
 276
 277   This option prints the version number of the linker. If you send any
 278   suggestions or bugfixes, please include this number.
 279
 280
 281   <label id="option--cfg-path">
 282   <tag><tt>--cfg-path path</tt></tag>
 283
 284   Specify a config file search path. This option may be used more than once.
 285   It adds a directory to the search path for config files. A config file given
 286   with the <tt><ref id="option-C" name="-C"></tt> option that has no path in
 287   its name is searched in the current directory, in the list of directories
 288   specified using <tt/--cfg-path/, in directories given by environment variables,
 289   and in a built-in default directory.
 290
 291
 292   <label id="option--dbgfile">
 293   <tag><tt>--dbgfile name</tt></tag>
 294
 295   Specify an output file for debug information. Available information will be
 296   written to this file. Using the <tt/-g/ option for the compiler and assembler
 297   will increase the amount of information available. Please note that debug
 298   information generation is currently being developed, so the format of the
 299   file and its contents are subject to change without further notice.
 300
 301
 302   <tag><tt>--lib file</tt></tag>
 303
 304   Links a library to the output. Use this command-line option instead of just
 305   naming the library file, if the linker is not able to determine the file
 306   type because of an unusual extension.
 307
 308
 309   <tag><tt>--obj file</tt></tag>
 310
 311   Links an object file to the output. Use this command-line option instead
 312   of just naming the object file, if the linker is not able to determine the
 313   file type because of an unusual extension.
 314
 315
 316   <label id="option--obj-path">
 317   <tag><tt>--obj-path path</tt></tag>
 318
 319   Specify an object file search path. This option may be used more than once.
 320   It adds a directory to the search path for object files. An object file
 321   passed to the linker that has no path in its name is searched in the current
 322   directory, in the list of directories specified using <tt/--obj-path/, in
 323   directories given by environment variables, and in a built-in default directory.
 324
 325 </descrip>
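
The options above can be combined as in the following command lines. This is
only a sketch: all file names (<tt/mysys.cfg/, <tt/main.o/, <tt/a.lib/ and so
on) and the symbol names are made up for illustration. The long forms of the
group options are used because parentheses usually need quoting in a shell; a
value written with a leading '&dollar;' would likewise have to be quoted.

<tscreen><verb>
        # Explicit config file, plus a verbose map file
        ld65 -C mysys.cfg -o prog.bin -m prog.map -vm crt0.o main.o mysys.lib

        # Let two libraries that reference each other resolve as a group
        ld65 -C mysys.cfg -o prog.bin main.o --start-group a.lib b.lib --end-group

        # Force linkage of the library module that exports the C function helper()
        ld65 -C mysys.cfg -o prog.bin -u _helper main.o mysys.lib

        # Define an external symbol with a hexadecimal value
        ld65 -C mysys.cfg -o prog.bin -D BUFSIZE=0x0200 main.o mysys.lib
</verb></tscreen>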
 326
 327
 328
 329 <sect>Search paths<p>
 330
 331 Starting with version 2.10, there are several search-path lists for files needed
 332 by the linker: one for libraries, one for object files, and one for config
 333 files.
 334
 335
 336 <sect1>Library search path<p>
 337
 338 The library search-path list contains, in this order:
 339
 340 <enum>
 341 <item>The current directory.
 342 <item>Any directory added with the <tt><ref id="option--lib-path"
 343       name="--lib-path"></tt> option on the command line.
 344 <item>The value of the environment variable <tt/LD65_LIB/ if it is defined.
 345 <item>A subdirectory named <tt/lib/ of the directory defined in the environment
 346       variable <tt/CC65_HOME/, if it is defined.
 347 <item>An optionally compiled-in library path.
 348 </enum>
 349
 350
 351 <sect1>Object file search path<p>
 352
 353 The object file search-path list contains, in this order:
 354
 355 <enum>
 356 <item>The current directory.
 357 <item>Any directory added with the <tt><ref id="option--obj-path"
 358       name="--obj-path"></tt> option on the command line.
 359 <item>The value of the environment variable <tt/LD65_OBJ/ if it is defined.
 360 <item>A subdirectory named <tt/obj/ of the directory defined in the environment
 361       variable <tt/CC65_HOME/, if it is defined.
 362 <item>An optionally compiled-in directory.
 363 </enum>
 364
 365
 366 <sect1>Config file search path<p>
 367
 368 The config file search-path list contains, in this order:
 369
 370 <enum>
 371 <item>The current directory.
 372 <item>Any directory added with the <tt><ref id="option--cfg-path"
 373       name="--cfg-path"></tt> option on the command line.
 374 <item>The value of the environment variable <tt/LD65_CFG/ if it is defined.
 375 <item>A subdirectory named <tt/cfg/ of the directory defined in the environment
 376       variable <tt/CC65_HOME/, if it is defined.
 377 <item>An optionally compiled-in directory.
 378 </enum>
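
As an illustration of the lookup order for config files, assume a POSIX shell
and the made-up directories below; none of these paths is a default of ld65,
and the file names are placeholders as well.

<tscreen><verb>
        export LD65_CFG=/home/me/cc65/cfg
        export CC65_HOME=/opt/cc65
        ld65 --cfg-path ./cfg -C mysys.cfg -o prog.bin main.o

        # mysys.cfg is looked for in this order:
        #   1. ./mysys.cfg                  (current directory)
        #   2. ./cfg/mysys.cfg              (--cfg-path)
        #   3. /home/me/cc65/cfg/mysys.cfg  (LD65_CFG)
        #   4. /opt/cc65/cfg/mysys.cfg      (CC65_HOME, subdirectory cfg)
        #   5. the compiled-in directory, if there is one
</verb></tscreen>

Libraries and object files are looked up the same way, using their own options
and environment variables as listed above.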
 379
 380
 381
 382 <sect>Detailed workings<p>
 383
 384 The linker does several things when combining object modules:
 385
 386 First, the command line is parsed from left to right. For each object file
 387 encountered (object files are recognized by a magic word in the header, so
 388 the linker does not care about the name), imported and exported
 389 identifiers are read from the file and inserted in a table. If a library
 390 name is given (libraries are also recognized by a magic word, there are no
 391 special naming conventions), each module in the library is checked to see
 392 whether one of its exports would satisfy an import from other modules. All
 393 modules where this is the case are marked. If duplicate identifiers are
 394 found, the linker issues warnings.
 395
 396 That procedure (parsing and reading from left to right) does mean that a
 397 library may only satisfy references for object modules (given directly or from
 398 a library) named <em/before/ that library. With the command line
 399
 400 <tscreen><verb>
 401         ld65 crt0.o clib.lib test.o
 402 </verb></tscreen>
 403
 404 references from <tt/test.o/ to modules in the library <tt/clib.lib/ cannot
 405 be resolved.  If <tt/test.o/ needs the library, you have to change the order
 406 of the modules on the command line:
 407
 408 <tscreen><verb>
 409         ld65 crt0.o test.o clib.lib
 410 </verb></tscreen>
 411
 412 Step two is to read the configuration file, assign start addresses
 413 for the segments, and define any linker symbols (see <ref id="config-files"
 414 name="Configuration files">).
 415
 416 After that, the linker is ready to produce an output file. Before doing so,
 417 in step three, it checks its data for consistency. That is, it checks for unresolved
 418 externals (if the output format is not relocatable) and for symbol type
 419 mismatches (for example, a zero-page symbol imported by a module as an absolute
 420 symbol).
 421
 422 Step four is to write the actual target files. In this step, the linker will
 423 resolve any expressions contained in the segment data. Circular references are
 424 also detected in this step (a symbol may have a circular reference that goes
 425 unnoticed if the symbol is not used).
 426
 427 Step five is to output a map file with a detailed list of all modules,
 428 segments and symbols encountered.
 429
 430 As a last step, if you give the <tt><ref id="option-v" name="-v"></tt> switch
 431 twice, you get a dump of the segment data. However, this may be quite
 432 unreadable if you're not a developer. :-)
 433
 434
 435
 436 <sect>Configuration files<label id="config-files"><p>
 437
 438 Configuration files are used to describe the layout of the output file(s). Two
 439 major topics are covered in a config file: The memory layout of the target
 440 architecture, and the assignment of segments to memory areas. In addition,
 441 several other attributes may be specified.
 442
 443 Case is ignored for keywords, that is, section or attribute names, but it is
 444 <em/not/ ignored for names and strings.
 445
 446
 447
 448 <sect1>Memory areas<p>
 449
 450 Memory areas are specified in a <tt/MEMORY/ section. Let's have a look at an
 451 example (this one describes the usable memory layout of the C64):
 452
 453 <tscreen><verb>
 454         MEMORY {
 455             RAM1:  start = $0800, size = $9800;
 456             ROM1:  start = $A000, size = $2000;
 457             RAM2:  start = $C000, size = $1000;
 458             ROM2:  start = $E000, size = $2000;
 459         }
 460 </verb></tscreen>
 461
 462 As you can see, there are two RAM areas and two ROM areas. The names
 463 (before the colon) are arbitrary names that must start with a letter, with
 464 the remaining characters being letters or digits. The names of the memory
 465 areas are used when assigning segments. As mentioned above, case is
 466 significant for those names.
 467
 468 The syntax above is used in all sections of the config file. The name
 469 (<tt/ROM1/ etc.) is said to be an identifier; the remaining tokens up to the
 470 semicolon specify attributes for this identifier. You may use the equal sign
 471 to assign values to attributes, and you may use a comma to separate
 472 attributes; you may also leave both out. But you <em/must/ use a semicolon to
 473 mark the end of the attributes for one identifier. The section above could also
 474 have looked like this:
 475
 476 <tscreen><verb>
 477         # Start of memory section
 478         MEMORY
 479         {
 480             RAM1:
 481                 start $0800
 482                 size $9800;
 483             ROM1:
 484                 start $A000
 485                 size $2000;
 486             RAM2:
 487                 start $C000
 488                 size $1000;
 489             ROM2:
 490                 start $E000
 491                 size $2000;
 492         }
 493 </verb></tscreen>
 494
 495 There are of course more attributes for a memory area than just start and
 496 size. Start and size are mandatory attributes; that is, each memory area
 497 defined <em/must/ have these attributes given (the linker will check that). I
 498 will cover other attributes later. As you may have noticed, I've used a
 499 comment in the example above. Comments start with a hash mark ('#'); the
 500 remainder of the line is ignored if this character is found.
 501
 502
 503 <sect1>Segments<p>
 504
 505 Let's assume you have written a program for your trusty old C64, and you would
 506 like to run it. For testing purposes, it should run in the <tt/RAM/ area. So
 507 we will start by assigning segments to memory areas in the <tt/SEGMENTS/
 508 section:
 509
 510 <tscreen><verb>
 511         SEGMENTS {
 512             CODE:   load = RAM1, type = ro;
 513             RODATA: load = RAM1, type = ro;
 514             DATA:   load = RAM1, type = rw;
 515             BSS:    load = RAM1, type = bss, define = yes;
 516         }
 517 </verb></tscreen>
 518
 519 What we are doing here is telling the linker that all segments go into the
 520 <tt/RAM1/ memory area in the order specified in the <tt/SEGMENTS/ section. So
 521 the linker will first write the <tt/CODE/ segment, then the <tt/RODATA/
 522 segment, then the <tt/DATA/ segment - but it will not write the <tt/BSS/
 523 segment. Why? Here the segment type comes in: For each segment specified, you may
 524 also specify a segment type. There are five possible segment types:
 525
 526 <tscreen><verb>
 527         ro          means readonly
 528         rw          means read/write
 529         bss         means that this is an uninitialized segment
 530         zp          a zeropage segment
 531         overwrite   a segment that overwrites (parts of) another one
 532
 533 </verb></tscreen>
 534
 535 So, because we specified that the segment with the name BSS is of type bss,
 536 the linker knows that this is uninitialized data, and will not write it to an
 537 output file. This is an important point: For the assembler, the <tt/BSS/
 538 segment has no special meaning. You specify which segments have the bss
 539 type when linking. This approach is much more flexible than having one
 540 fixed bss segment, and is a result of the design decision to support an
 541 arbitrary segment count.
 542
 543 If you specify "<tt/type = bss/" for a segment, the linker will make sure that
 544 this segment contains only uninitialized data (that is, zeroes), and issue
 545 a warning if this is not the case.
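
Because the type is assigned in the config file, a segment of any name can be
treated as uninitialized data. In the following sketch, <tt/SCRATCH/ is a
hypothetical segment name that the assembler source would have to use as well:

<tscreen><verb>
        SEGMENTS {
            CODE:    load = RAM1, type = ro;
            BSS:     load = RAM1, type = bss;
            SCRATCH: load = RAM1, type = bss;   # not written to the output file
        }
</verb></tscreen>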
 546
 547 For a <tt/bss/ type segment to be useful, it must be cleared somehow by your
 548 program (this usually happens in the startup code - for example, the startup
 549 code for cc65-generated programs takes care of that). But how does your
 550 code know where the segment starts, and how big it is? The linker is able to
 551 give that information, but you must request it. This is what we're doing with
 552 the "<tt/define = yes/" attribute in the <tt/BSS/ definition. For each
 553 segment where this attribute is true, the linker will export three symbols.
 554
 555 <tscreen><verb>
 556         __NAME_LOAD__   This is set to the address where the
 557                         segment is loaded.
 558         __NAME_RUN__    This is set to the run address of the
 559                         segment. We will cover run addresses
 560                         later.
 561         __NAME_SIZE__   This is set to the segment size.
 562 </verb></tscreen>
 563
 564 Replace <tt/NAME/ by the name of the segment; in the example above, this would
 565 be <tt/BSS/. These symbols may be accessed by your code.
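
Putting the two sections shown so far together gives a small but complete
config file. The file name <tt/c64test.cfg/ and the module names in the command
line are assumptions made for this sketch; a real program may need further
segments (a zero-page segment, for example).

<tscreen><verb>
        # c64test.cfg
        MEMORY {
            RAM1:  start = $0800, size = $9800;
        }
        SEGMENTS {
            CODE:   load = RAM1, type = ro;
            RODATA: load = RAM1, type = ro;
            DATA:   load = RAM1, type = rw;
            BSS:    load = RAM1, type = bss, define = yes;
        }
</verb></tscreen>

<tscreen><verb>
        ld65 -C c64test.cfg crt0.o test.o clib.lib
</verb></tscreen>

With this config, the linker exports <tt/__BSS_LOAD__/, <tt/__BSS_RUN__/ and
<tt/__BSS_SIZE__/, which the startup code can import to clear the segment.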
 566
 567 Now, as we've configured the linker to write the first three segments and
 568 create symbols for the last one, there's only one question left: Where does
 569 the linker put the data? It would be very convenient to have the data in a
 570 file, wouldn't it?
 571
 572 <sect1>Output files<p>
 573
 574 We don't have any files specified above, and indeed, this is not needed in a
 575 simple configuration like the one above. There is an additional attribute
 576 "file" that may be specified for a memory area, that gives a file name to
 577 write the area data into. If there is no file name given, the linker will
 578 assign the default file name. This is "a.out" or the one given with the
 579 <tt><ref id="option-o" name="-o"></tt> option on the command line. Since the
 580 default behaviour is OK for our purposes, I did not use the attribute in the
 581 example above. Let's have a look at it now.
 582
 583 The "file" attribute (the keyword may also be written as "FILE" if you like
 584 that better) takes a string enclosed in double quotes ('&dquot;') that specifies the
 585 file where the data is written. You may specify the same file several times;
 586 in that case the data for all memory areas having this file name is written
 587 into this file, in the order of the memory areas defined in the <tt/MEMORY/
 588 section. Let's specify some file names in the <tt/MEMORY/ section used above:
 589
 590 <tscreen><verb>
 591         MEMORY {
 592             RAM1:  start = $0800, size = $9800, file = %O;
 593             ROM1:  start = $A000, size = $2000, file = "rom1.bin";
 594             RAM2:  start = $C000, size = $1000, file = %O;
 595             ROM2:  start = $E000, size = $2000, file = "rom2.bin";
 596         }
 597 </verb></tscreen>
 598
 599 The <tt/%O/ used here is a way to specify the default behaviour explicitly:
 600 <tt/%O/ is replaced by a string (including the quotes) that contains the
 601 default output name, that is, "a.out" or the name specified with the <tt><ref
 602 id="option-o" name="-o"></tt> option on the command line. Into this file, the
 603 linker will first write any segments that go into <tt/RAM1/, and will then
 604 append the segments for <tt/RAM2/, because the memory areas are given in this
 605 order. So, for the RAM areas, nothing has really changed.
 606
 607 We've not used the ROM areas, but we will do that below, so we give the file
 608 names here. Segments that go into <tt/ROM1/ will be written to a file named
 609 "rom1.bin", and segments that go into <tt/ROM2/ will be written to a file
 610 named "rom2.bin". The name given on the command line is ignored in both cases.
 611
 612 Assigning an empty file name to a memory area will discard the data written
 613 to it. This is useful if the memory area has segments assigned that are empty
 614 (for example, because they are of type bss). In that case, the linker would
 615 otherwise create an empty output file. This may be suppressed by assigning an
 616 empty file name to that memory area.
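
As a sketch, assume that only segments of type bss are assigned to <tt/RAM2/.
An empty file name then keeps the linker from creating a file for that area:

<tscreen><verb>
        MEMORY {
            RAM1:  start = $0800, size = $9800, file = %O;
            RAM2:  start = $C000, size = $1000, file = "";   # nothing is written
        }
</verb></tscreen>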
 617
 618 The <tt/%O/ sequence is also allowed inside a string. So using
 619
 620 <tscreen><verb>
 621         MEMORY {
 622             ROM1:  start = $A000, size = $2000, file = "%O-1.bin";
 623             ROM2:  start = $E000, size = $2000, file = "%O-2.bin";
 624         }
 625 </verb></tscreen>
 626
 627 would write two files that start with the name of the output file specified on
 628 the command line, with "-1.bin" and "-2.bin" appended respectively. Because
 629 '%' is used as an escape char, the sequence "%%" has to be used if a single
 630 percent sign is required.
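
To illustrate the expansion, assume a config file named <tt/sos.cfg/ that
contains the <tt/MEMORY/ section above together with a suitable <tt/SEGMENTS/
section. The config file name and the object file names are made up for this
sketch.

<tscreen><verb>
        ld65 -C sos.cfg -o sos crt0.o kernal.o
        # segments that go into ROM1 are written to sos-1.bin
        # segments that go into ROM2 are written to sos-2.bin
</verb></tscreen>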
 631
 632 <sect1>OVERWRITE segments<p>
 633
 634 There are situations when you may wish to overwrite some part (or parts) of a
 635 segment with another one. Perhaps you are modifying an OS ROM that has its
 636 public subroutines at fixed, well-known addresses, and you want to prevent them
 637 from shifting to other locations in memory if your changed code takes less
 638 space. Or you are updating a block of code available in binary-only form with
 639 fixes that are scattered in various places. Generally, whenever you want to
 640 minimize disturbance to existing code brought on by your updates, OVERWRITE
 641 segments are worth considering.
 642
 643 Here is an example:
 644
 645 <tscreen><verb>
 646 MEMORY {
 647     RAM: file = "", start = $6000, size = $2000, type=rw;
 648     ROM: file = %O, start = $8000, size = $8000, type=ro;
 649 }
 650 </verb></tscreen>
 651
 652 Nothing unusual so far, just two memory blocks - one RAM, one ROM. Now let's
 653 look at the segment configuration:
 654
 655 <tscreen><verb>
 656 SEGMENTS {
 657     RAM:       load = RAM, type = bss;
 658     ORIGINAL:  load = ROM, type = ro;
 659     FASTCOPY:  load = ROM, start=$9000, type = overwrite;
 660     JMPPATCH1: load = ROM, start=$f7e8, type = overwrite;
 661     DEBUG:     load = ROM, start=$8000, type = overwrite;
 662     VERSION:   load = ROM, start=$e5b7, type = overwrite;
 663 }
 664 </verb></tscreen>
 665
 666 The segment named ORIGINAL contains the original code, disassembled or provided in
 667 binary form (i.e. using the <tt/.INCBIN/ directive; see the <tt/ca65/ assembler
 668 document).  The subsequent four segments will be relocated to the addresses specified
 669 by their "start" attributes ("offset" can also be used) and then will overwrite
 670 whatever was at these locations in the ORIGINAL segment. In the end, the resulting
 671 binary output file will thus contain the original data with the exception of four
 672 sequences starting at $9000, $f7e8, $8000 and $e5b7, which will contain code from
 673 their respective segments. How long these sequences will be depends on the
 674 lengths of the corresponding segments - they can even overlap, so think about what
 675 you're doing.
 676
 677 Finally, note that OVERWRITE segments should be the final segments loaded to a
 678 particular memory area, and that they need at least one of "start" or "offset"
 679 attributes specified.
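
As noted above, "offset" may be used instead of "start". The following sketch
assumes that an offset is counted from the start of the memory area, so with
<tt/ROM/ starting at &dollar;8000 both lines would place the segment at the
same address. Only one of the two lines would appear in a real config file.

<tscreen><verb>
        FASTCOPY:  load = ROM, start  = $9000, type = overwrite;
        FASTCOPY:  load = ROM, offset = $1000, type = overwrite;
</verb></tscreen>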
 680
 681 <sect1>LOAD and RUN addresses (ROMable code)<p>
 682
 683 Let us look now at a more complex example. Say, you've successfully tested
 684 your new "Super Operating System" (SOS for short) for the C64, and you
 685 will now go and replace the ROMs by your own code. When doing that, you
 686 face a new problem: If the code runs in RAM, we need not care about
 687 read/write data. But now, if the code is in ROM, we must care about it.
 688 Remember the default segments (you may of course specify your own):
 689
 690 <tscreen><verb>
 691         CODE            read-only code
 692         RODATA          read-only data
 693         DATA            read/write data
 694         BSS             uninitialized data, read/write
 695 </verb></tscreen>
 696
 697 Since <tt/BSS/ is not initialized, we need not care about it now, but what
 698 about <tt/DATA/? <tt/DATA/ contains initialized data, that is, data that was
 699 explicitly assigned a value. And your program will rely on these values on
 700 startup. Since there's no way to remember the contents of the data segment,
 701 other than storing it into one of the ROMs, we have to put it there. But
 702 unfortunately, ROM is not writable, so we have to copy it into RAM before
 703 running the actual code.
 704
 705 The linker won't copy the data from ROM into RAM for you (this must be done by
 706 the startup code of your program), but it has some features that will help you
 707 in this process.
 708
 709 First, you may not only specify a "<tt/load/" attribute for a segment, but
 710 also a "<tt/run/" attribute. The "<tt/load/" attribute is mandatory, and, if
 711 you don't specify a "<tt/run/" attribute, the linker assumes that load area
 712 and run area are the same. We will use this feature for our data area:
 713
 714 <tscreen><verb>
 715         SEGMENTS {
 716             CODE:   load = ROM1, type = ro;
 717             RODATA: load = ROM2, type = ro;
 718             DATA:   load = ROM2, run = RAM2, type = rw, define = yes;
 719             BSS:    load = RAM2, type = bss, define = yes;
 720         }
 721 </verb></tscreen>
 722
 723 Let's have a closer look at this <tt/SEGMENTS/ section. We specify that the
 724 <tt/CODE/ segment goes into <tt/ROM1/ (the one at $A000). The readonly data
 725 goes into <tt/ROM2/. Read/write data will be loaded into <tt/ROM2/ but is run
 726 in <tt/RAM2/. That means that all references to labels in the <tt/DATA/
 727 segment are relocated to be in <tt/RAM2/, but the segment is written to
<tt/ROM2/. All your startup code has to do is to copy the data from its
 729 location in <tt/ROM2/ to the final location in <tt/RAM2/.
 730
So, how do you know where the data is located? This is the second point
where you get help from the linker. Remember the "<tt/define/" attribute?
 733 Since we have set this attribute to true, the linker will define three
 734 external symbols for the data segment that may be accessed from your code:
 735
 736 <tscreen><verb>
 737         __DATA_LOAD__   This is set to the address where the segment
 738                         is loaded, in this case, it is an address in
 739                         ROM2.
 740         __DATA_RUN__    This is set to the run address of the segment,
 741                         in this case, it is an address in RAM2.
 742         __DATA_SIZE__   This is set to the segment size.
 743 </verb></tscreen>
 744
So, what your startup code must do is copy <tt/__DATA_SIZE__/ bytes from
 746 <tt/__DATA_LOAD__/ to <tt/__DATA_RUN__/ before any other routines are called.
 747 All references to labels in the <tt/DATA/ segment are relocated to <tt/RAM2/
 748 by the linker, so things will work properly.
 749
There's a library subroutine called <tt/copydata/ (in a module named
<tt/copydata.s/) that might be used to do the actual copying. Be sure to have a
look at its inner workings before using it!
 753
 754
 755 <sect1>Other MEMORY area attributes<p>
 756
 757 There are some other attributes not covered above. Before starting the
 758 reference section, I will discuss the remaining things here.
 759
You may also request symbol definitions for memory areas. This may be
 761 useful for things like a software stack, or an I/O area.
 762
 763 <tscreen><verb>
 764         MEMORY {
 765             STACK:  start = $C000, size = $1000, define = yes;
 766         }
 767 </verb></tscreen>
 768
 769 This will define some external symbols that may be used in your code:
 770
 771 <tscreen><verb>
 772         __STACK_START__         This is set to the start of the memory
 773                                 area, $C000 in this example.
 774         __STACK_SIZE__          The size of the area, here $1000.
 775         __STACK_LAST__          This is NOT the same as START+SIZE.
 776                                 Instead, it is defined as the first
 777                                 address that is not used by data. If we
 778                                 don't define any segments for this area,
 779                                 the value will be the same as START.
 780         __STACK_FILEOFFS__      The binary offset in the output file. This
 781                                 is not defined for relocatable output file
 782                                 formats (o65).
 783 </verb></tscreen>
 784
 785 A memory section may also have a type. Valid types are
 786
 787 <tscreen><verb>
 788         ro      for readonly memory
 789         rw      for read/write memory.
 790 </verb></tscreen>
 791
The linker will ensure that no segment marked as read/write or bss is put
into a memory area that is marked as readonly.
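
As a minimal sketch, the type is given like any other memory area attribute.
The area names follow the earlier examples; the start addresses and sizes
shown here are assumptions made for illustration:

<tscreen><verb>
        MEMORY {
            # Read/write memory, may hold rw and bss segments
            RAM2:   start = $C000, size = $1000, type = rw;
            # Readonly memory, only suitable for ro segments
            ROM2:   start = $E000, size = $2000, type = ro;
        }
</verb></tscreen>

With such a setup, a segment of type <tt/rw/ that names <tt/ROM2/ as its only
memory area would be rejected by the linker.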
 794
 795 Unused memory in a memory area may be filled. Use the "<tt/fill = yes/"
 796 attribute to request this. The default value to fill unused space is zero. If
 797 you don't like this, you may specify a byte value that is used to fill these
 798 areas with the "<tt/fillval/" attribute. If there is no "<tt/fillval/"
 799 attribute for the segment, the "<tt/fillval/" attribute of the memory area (or
 800 its default) is used instead. This means that the value may also be used to
 801 fill unfilled areas generated by the assembler's <tt/.ALIGN/ and <tt/.RES/
 802 directives.
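
The following sketch shows both attributes together. The start address and
size are assumptions for illustration, and the fill value &dollar;FF is an
arbitrary choice:

<tscreen><verb>
        MEMORY {
            # Pad the unused rest of the area with $FF instead of zero
            ROM2:   start = $E000, size = $2000, type = ro, file = %O,
                    fill = yes, fillval = $FF;
        }
</verb></tscreen>

Because the area is filled, it always occupies its full size in the output
file, which is usually what is wanted for a ROM image.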
 803
 804 The symbol <tt/%S/ may be used to access the default start address (that is,
 805 the one defined in <ref id="FEATURES" name="the FEATURES section">, or the
 806 value given on the command line with the <tt><ref id="option-S" name="-S"></tt>
 807 option).
 808
 809 To support systems with banked memory, a special attribute named <tt/bank/ is
 810 available. The attribute value is an arbitrary 32-bit integer. The assembler
 811 has a builtin function named <tt/.BANK/ which may be used with an argument
 812 that has a segment reference (for example a symbol). The result of this
 813 function is the value of the bank attribute for the run memory area of the
 814 segment.
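
A minimal sketch of a banked setup might look as follows. The area and segment
names, addresses and bank numbers are hypothetical and only serve to show
where the attribute goes:

<tscreen><verb>
        MEMORY {
            # Two banks that are mapped into the same address window
            ROMBANK0:   start = $8000, size = $4000, bank = $00;
            ROMBANK1:   start = $8000, size = $4000, bank = $01;
        }
        SEGMENTS {
            BANK0CODE:  load = ROMBANK0, type = ro;
            BANK1CODE:  load = ROMBANK1, type = ro;
        }
</verb></tscreen>

Under these assumptions, applying <tt/.BANK/ to a symbol defined in the
<tt/BANK1CODE/ segment should yield 1, the bank value of its run memory area.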
 815
 816
 817 <sect1>Other SEGMENT attributes<p>
 818
 819 Segments may be aligned to some memory boundary. Specify "<tt/align = num/" to
 820 request this feature. To align all segments on a page boundary, use
 821
 822 <tscreen><verb>
 823         SEGMENTS {
 824             CODE:   load = ROM1, type = ro, align = $100;
 825             RODATA: load = ROM2, type = ro, align = $100;
 826             DATA:   load = ROM2, run = RAM2, type = rw, define = yes,
 827                     align = $100;
 828             BSS:    load = RAM2, type = bss, define = yes, align = $100;
 829         }
 830 </verb></tscreen>
 831
If an alignment is requested, the linker will add enough space to the output
file so that the new segment starts at an address that is divisible by the
given number without a remainder. All addresses are adjusted accordingly. To
fill the unused space, bytes of zero are used, or, if the memory area has a
"<tt/fillval/" attribute, that value. Alignment is always needed if you have
used the <tt/.ALIGN/ command in the assembler. The alignment of a segment
must be equal to or greater than the alignment used in the <tt/.ALIGN/ command.
The linker will check that, and issue a warning if the alignment of a segment
is lower than the alignment requested in an <tt/.ALIGN/ command of one of the
modules making up this segment.
 842
 843 For a given segment you may also specify a fixed offset into a memory area or
 844 a fixed start address. Use this if you want the code to run at a specific
 845 address (a prominent case is the interrupt vector table which must go at
 846 address $FFFA). Only one of <tt/ALIGN/ or <tt/OFFSET/ or <tt/START/ may be
specified. If the directive creates empty space, it will be filled with zero,
or with the value specified with the "<tt/fillval/" attribute if one is given.
 849 The linker will warn you if it is not possible to put the code at the
 850 specified offset (this may happen if other segments in this area are too
 851 large). Here's an example:
 852
 853 <tscreen><verb>
 854         SEGMENTS {
 855             VECTORS: load = ROM2, type = ro, start = $FFFA;
 856         }
 857 </verb></tscreen>
 858
 859 or (for the segment definitions from above)
 860
 861 <tscreen><verb>
 862         SEGMENTS {
 863             VECTORS: load = ROM2, type = ro, offset = $1FFA;
 864         }
 865 </verb></tscreen>
 866
 867 The "<tt/align/", "<tt/start/" and "<tt/offset/" attributes change placement
 868 of the segment in the run memory area, because this is what is usually
 869 desired. If load and run memory areas are equal (which is the case if only the
 870 load memory area has been specified), the attributes will also work. There is
 871 also an "<tt/align_load/" attribute that may be used to align the start of the
 872 segment in the load memory area, in case different load and run areas have
 873 been specified. There are no special attributes to set start or offset for
 874 just the load memory area.
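
As a sketch, using the <tt/DATA/ segment from the examples above (the
alignment values are arbitrary assumptions), the two attributes may be
combined like this:

<tscreen><verb>
        SEGMENTS {
            # align:      placement in the run area (RAM2)
            # align_load: placement of the stored copy in the load area (ROM2)
            DATA:   load = ROM2, run = RAM2, type = rw, define = yes,
                    align = $100, align_load = $10;
        }
</verb></tscreen>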
 875
 876 A "<tt/fillval/" attribute may not only be specified for a memory area, but
 877 also for a segment. The value must be an integer between 0 and 255. It is used
 878 as the fill value for space reserved by the assembler's <tt/.ALIGN/ and <tt/.RES/
 879 commands. It is also used as the fill value for space between sections (part of a
 880 segment that comes from one object file) caused by alignment, but not for
space that precedes the first section.
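
For illustration, the following sketch sets a fill value for a single segment.
The value &dollar;EA (the 6502 NOP opcode) is just an example choice:

<tscreen><verb>
        SEGMENTS {
            # Gaps from .ALIGN and .RES inside CODE are filled with $EA
            CODE:   load = ROM1, type = ro, fillval = $EA;
        }
</verb></tscreen>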
 882
To suppress the warning that the linker issues if it encounters a segment that
is not found in any of the input files, use "<tt/optional=yes/" as an additional
 885 segment attribute. Be careful when using this attribute, because a missing
 886 segment may be a sign of a problem, and if you're suppressing the warning,
 887 there is no one left to tell you about it.
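
A short sketch, assuming a program that may or may not contain a <tt/LOWCODE/
segment:

<tscreen><verb>
        SEGMENTS {
            CODE:       load = ROM1, type = ro;
            # No warning if none of the input files contains LOWCODE
            LOWCODE:    load = ROM1, type = ro, optional = yes;
        }
</verb></tscreen>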
 888
 889 <sect1>The FILES section<p>
 890
 891 The <tt/FILES/ section is used to support other formats than straight binary
 892 (which is the default, so binary output files do not need an explicit entry
 893 in the <tt/FILES/ section).
 894
The <tt/FILES/ section lists output files. The only attribute is the format of
each output file. Assigning binary format to the default output file would
 897 look like this:
 898
 899 <tscreen><verb>
 900         FILES {
 901             %O: format = bin;
 902         }
 903 </verb></tscreen>
 904
There are two other available formats. One is the o65 format specified by Andre
 906 Fachat (see the <url url="http://www.6502.org/users/andre/o65/fileformat.html"
 907 name="6502 binary relocation format specification">). It is defined like this:
 908
 909 <tscreen><verb>
 910         FILES {
 911             %O: format = o65;
 912         }
 913 </verb></tscreen>
 914
The other available format is the Atari (xex) segmented file format. This is
the standard format used by Atari DOS 2.0 and later file managers on the Atari
8-bit computers, and it is defined like this:
 918
 919 <tscreen><verb>
 920         FILES {
 921             %O: format = atari;
 922         }
 923 </verb></tscreen>
 924
 925 In the Atari segmented file format, the linker will write each <tt/MEMORY/ area
 926 as a new segment, including a header with the start and end address.
 927
 928 The necessary o65 or Atari attributes are defined in a special section labeled
 929 <ref id="FORMAT" name="FORMAT">.
 930
 931
 932
 933 <sect1>The FORMAT section<label id="FORMAT"><p>
 934
The <tt/FORMAT/ section is used to describe file formats. The default (binary)
format currently has no attributes, so, while it may be listed in this
 937 section, the attribute list is empty. The second supported format,
 938 <url url="http://www.6502.org/users/andre/o65/fileformat.html" name="o65">,
 939 has several attributes that may be defined here.
 940
 941 <tscreen><verb>
 942     FORMATS {
 943         o65: os = lunix, version = 0, type = small,
 944              import = LUNIXKERNEL,
 945              export = _main;
 946     }
 947 </verb></tscreen>
 948
 949 The Atari file format has two attributes:
 950
 951 <descrip>
 952
 953   <tag><tt>RUNAD = symbol</tt></tag>
 954
  Specify a symbol as the run address of the binary. The loader will call this
  address after the whole file has been loaded into memory. If the attribute is
  omitted, no run address is included in the file.
 958
 959   <tag><tt>INITAD = memory_area : symbol</tt></tag>
 960
 961   Specify a symbol as the initialization address for the given memory area.
 962   The binary loader will call this address just after the memory area is loaded
 963   into memory, before continuing loading the rest of the file.
 964
 965 </descrip>
 966
 967
 968 <tscreen><verb>
 969     FORMATS {
 970         atari: runad = _start;
 971     }
 972 </verb></tscreen>
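
Both attributes may be given together. In the following sketch, the memory
area name <tt/LOADER/ and the symbol <tt/_init/ are hypothetical and stand for
an area and a routine defined elsewhere in your configuration and code:

<tscreen><verb>
    FORMATS {
        # _init is called right after the LOADER area has been loaded,
        # _start is called once the whole file is in memory
        atari: runad  = _start,
               initad = LOADER: _init;
    }
</verb></tscreen>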
 973
 974
 975 <sect1>The FEATURES section<label id="FEATURES"><p>
 976
 977 In addition to the <tt/MEMORY/ and <tt/SEGMENTS/ sections described above, the
 978 linker has features that may be enabled by an additional section labeled
 979 <tt/FEATURES/.
 980
 981
 982 <sect2>The CONDES feature<p>
 983
 984 <tt/CONDES/ is used to tell the linker to emit module constructor/destructor
 985 tables.
 986
 987 <tscreen><verb>
 988         FEATURES {
 989             CONDES: segment = RODATA,
 990                     type = constructor,
 991                     label = __CONSTRUCTOR_TABLE__,
 992                     count = __CONSTRUCTOR_COUNT__;
 993         }
 994 </verb></tscreen>
 995
 996 The <tt/CONDES/ feature has several attributes:
 997
 998 <descrip>
 999
1000   <tag><tt>segment</tt></tag>
1001
1002   This attribute tells the linker into which segment the table should be
1003   placed. If the segment does not exist, it is created.
1004
1005
1006   <tag><tt>type</tt></tag>
1007
1008   Describes the type of the routines to place in the table. Type may be one of
1009   the predefined types <tt/constructor/, <tt/destructor/, <tt/interruptor/, or
1010   a numeric value between 0 and 6.
1011
1012
1013   <tag><tt>label</tt></tag>
1014
1015   This specifies the label to use for the table. The label points to the start
1016   of the table in memory and may be used from within user-written code.
1017
1018
1019   <tag><tt>count</tt></tag>
1020
1021   This is an optional attribute. If specified, an additional symbol is defined
1022   by the linker using the given name. The value of this symbol is the number
1023   of entries (<em/not/ bytes) in the table. While this attribute is optional,
1024   it is often useful to define it.
1025
1026
1027   <tag><tt>order</tt></tag>
1028
1029   An optional attribute that takes one of the keywords <tt/increasing/ or
1030   <tt/decreasing/ as an argument. Specifies the sorting order of the entries
1031   within the table. The default is <tt/increasing/, which means that the
1032   entries are sorted with increasing priority (the first entry has the lowest
  priority). "Priority" is the priority specified when declaring a symbol as
  <tt/.CONDES/ with the assembler; higher values mean higher priority. You may
  change this behaviour by specifying <tt/decreasing/ as the argument; the
  order of entries is reversed in this case.
1037
1038   Please note that the order of entries with equal priority is undefined.
1039
1040   <tag><tt>import</tt></tag>
1041
1042   This attribute defines a valid symbol name, that is added as an import
1043   to the modules defining a constructor/destructor of the given type.
1044   This can be used to force linkage of a module if this module exports the
1045   requested symbol.
1046
1047 </descrip>
1048
1049 Without specifying the <tt/CONDES/ feature, the linker will not create any
1050 tables, even if there are <tt/condes/ entries in the object files.
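
Several tables may be requested in one <tt/FEATURES/ section. The following
sketch adds a destructor table to the example above. The label and count names
are chosen by analogy and are assumptions, and "<tt/order = decreasing/" is
included purely to illustrate the attribute:

<tscreen><verb>
        FEATURES {
            CONDES: segment = RODATA,
                    type = constructor,
                    label = __CONSTRUCTOR_TABLE__,
                    count = __CONSTRUCTOR_COUNT__;
            # Second table, entries sorted with decreasing priority
            CONDES: segment = RODATA,
                    type = destructor,
                    label = __DESTRUCTOR_TABLE__,
                    count = __DESTRUCTOR_COUNT__,
                    order = decreasing;
        }
</verb></tscreen>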
1051
1052 For more information see the <tt/.CONDES/ command in the <url
1053 url="ca65.html" name="ca65 manual">.
1054
1055
1056 <sect2>The STARTADDRESS feature<p>
1057
1058 <tt/STARTADDRESS/ is used to set the default value for the start address,
1059 which can be referenced by the <tt/%S/ symbol. The builtin default for the
1060 linker is &dollar;200.
1061
1062 <tscreen><verb>
1063         FEATURES {
1064             # Default start address is $1000
1065             STARTADDRESS:       default = $1000;
1066         }
1067 </verb></tscreen>
1068
1069 Please note that order is important: The default start address must be defined
<em/before/ the <tt/%S/ symbol is used in the config file. This usually
means that the <tt/FEATURES/ section has to go at the top of the config file.
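
A minimal sketch of a config file that respects this order could look like
this. The memory area name and size are assumptions for illustration:

<tscreen><verb>
        # FEATURES comes first, so the default is known when %S is used
        FEATURES {
            STARTADDRESS:       default = $1000;
        }
        MEMORY {
            # Starts at $1000 unless overridden on the command line with -S
            RAM1:   start = %S, size = $1000;
        }
</verb></tscreen>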
1072
1073
1074
1075 <sect1>The SYMBOLS section<label id="SYMBOLS"><p>
1076
1077 The configuration file may also be used to define symbols used in the link
stage or to force symbol imports. This is done in the SYMBOLS section. The
1079 symbol name is followed by a colon and symbol attributes.
1080
1081 The following symbol attributes are supported:
1082
1083 <descrip>
1084
1085   <tag><tt>addrsize</tt></tag>
1086
1087   The <tt/addrsize/ attribute specifies the address size of the symbol and
1088   may be one of
1089 <itemize>
1090     <item><tt/zp/, <tt/zeropage/ or <tt/direct/
1091     <item><tt/abs/, <tt/absolute/ or <tt/near/
1092     <item><tt/far/
1093     <item><tt/long/ or <tt/dword/.
1094 </itemize>
1095
1096 Without this attribute, the default address size is <tt/abs/.
1097
1098   <tag><tt>type</tt></tag>
1099
1100   This attribute is mandatory. Its value is one of <tt/export/, <tt/import/ or
1101   <tt/weak/. <tt/export/ means that the symbol is defined and exported from
1102   the linker config. <tt/import/ means that an import is generated for this
  symbol, possibly forcing a module that exports this symbol to be included
  in the output. <tt/weak/ is similar to <tt/export/. However, the symbol is
1105   only defined if it is not defined elsewhere.
1106
1107   <tag><tt>value</tt></tag>
1108
1109   This must only be given for symbols of type <tt/export/ or <tt/weak/. It
1110   defines the value of the symbol and may be an expression.
1111
1112 </descrip>
1113
1114 The following example defines the stack size for an application, but allows
1115 the programmer to override the value by specifying <tt/--define
1116 __STACKSIZE__=xxx/ on the command line.
1117
1118 <tscreen><verb>
1119         SYMBOLS {
1120             # Define the stack size for the application
1121             __STACKSIZE__:  type = weak, value = $800;
1122         }
1123 </verb></tscreen>
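
The other symbol types are used in the same way. In the following sketch, all
symbol names and values are hypothetical and only illustrate the attributes
described above:

<tscreen><verb>
        SYMBOLS {
            # Generate an import, so that a module exporting this
            # symbol gets linked in
            __MYHEADER__:   type = import;
            # Always define this symbol from the config
            __HIMEM__:      type = export, value = $D000;
            # A symbol with zeropage address size
            __ZPBASE__:     type = export, addrsize = zp, value = $80;
        }
</verb></tscreen>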
1124
1125
1126
1127 <sect>Special segments<p>
1128
1129 The builtin config files do contain segments that have a special meaning for
1130 the compiler and the libraries that come with it. If you replace the builtin
1131 config files, you will need the following information.
1132
1133 <sect1>ONCE<p>
1134
1135 The ONCE segment is used for initialization code run only once before
execution reaches main() - provided that the program runs in RAM. You
may, for example, add the ONCE segment to the heap on really
memory-constrained systems.
1139
1140 <sect1>LOWCODE<p>
1141
1142 For the LOWCODE segment, it is guaranteed that it won't be banked out, so it
1143 is reachable at any time by interrupt handlers or similar.
1144
1145 <sect1>STARTUP<p>
1146
1147 This segment contains the startup code which initializes the C software stack
1148 and the libraries. It is placed in its own segment because it needs to be
1149 loaded at the lowest possible program address on several platforms.
1150
1151 <sect1>ZPSAVE<p>
1152
1153 The ZPSAVE segment contains the original values of the zeropage locations used
1154 by the ZEROPAGE segment. It is placed in its own segment because it must not be
1155 initialized.
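
As a rough sketch (not a copy of any builtin config), a custom
<tt/SEGMENTS/ section that provides these segments might look as follows. The
memory area name <tt/RAM1/ and the choice of optional segments are
assumptions:

<tscreen><verb>
        SEGMENTS {
            # Listed first, so it ends up at the lowest address in RAM1
            STARTUP:    load = RAM1, type = ro;
            LOWCODE:    load = RAM1, type = ro, optional = yes;
            CODE:       load = RAM1, type = ro;
            ONCE:       load = RAM1, type = ro, optional = yes;
            # Must not be initialized, so it is declared as bss
            ZPSAVE:     load = RAM1, type = bss;
        }
</verb></tscreen>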
1156
1157
1158
1159 <sect>Copyright<p>
1160
1161 ld65 (and all cc65 binutils) are (C) Copyright 1998-2005 Ullrich von
1162 Bassewitz. For usage of the binaries and/or sources the following
1163 conditions do apply:
1164
1165 This software is provided 'as-is', without any expressed or implied
1166 warranty.  In no event will the authors be held liable for any damages
1167 arising from the use of this software.
1168
1169 Permission is granted to anyone to use this software for any purpose,
1170 including commercial applications, and to alter it and redistribute it
1171 freely, subject to the following restrictions:
1172
1173 <enum>
1174 <item>  The origin of this software must not be misrepresented; you must not
1175         claim that you wrote the original software. If you use this software
1176         in a product, an acknowledgment in the product documentation would be
1177         appreciated but is not required.
1178 <item>  Altered source versions must be plainly marked as such, and must not
1179         be misrepresented as being the original software.
1180 <item>  This notice may not be removed or altered from any source
1181         distribution.
1182 </enum>
1183
1184
1185
1186 </article>