File types:
- ProDOS OBJ, LIB, S16, RTL, EXE, PIF, TIF, NDA, CDA, TOL, DVR, LDF, FST
Primary references:
- ORCA/M: A Macro Assembler for the Apple IIgs (ORCA/M 2.0 manual); appendix B documents OMF v0, v1, and v2.1 files (see p.488)
- Apple IIgs Programmer's Workshop Reference; chapter 7 describes OMF v1.0 and v2.0 (see p.228)
- Apple IIgs GS/OS Reference, for GS/OS System Software Version 5.0 and later; Appendix F describes OMF v2.1, and Chapter 8 has some useful information about how the loader works (e.g. p.205)
- "Undocumented Secrets of the Apple IIGS System Loader" web page by Neil Parker, http://nparker.llx.com/a2/loader.html; documents ExpressLoad segments and other arcana
- 6502bench SourceGen OMF converter; acts as a loader to prepare load files for disassembly: https://github.com/fadden/6502bench/blob/master/SourceGen/Tools/Omf/
Object Module Format defines a way to package executable code and data. It was developed by ByteWorks for its assembler and compiler products (e.g. ORCA/M), and licensed by Apple for GS/OS and the Apple IIgs Programmer's Workshop (APW). OMF can be used for four kinds of files:
- Object files. The output of an assembler or compiler is stored in these. They may contain unresolved references, and must be processed by a linker before they can be run. (ProDOS type OBJ.)
- Library files. These contain a number of pieces that are extracted by the linker and added to an object file while it is built. (ProDOS type LIB.)
- Load files. These are executable programs generated by a linker, from object files and library files. The system loader will read them into memory and process relocations. (ProDOS types S16, EXE, PIF, TIF, NDA, CDA, TOL, DVR, LDF, FST.)
- Run-time library files. These are defined by the OMF specification, but it is unclear whether the loader actually implements them. (ProDOS type RTL.)
OMF files don't have an indication of which kind they are. They are simply a series of segments, and it's up to the program reading them to decide how they should be handled. OMF v1 defines a key field differently for load files and library files (BLKCNT / BYTECNT), so it's not always possible to parse OMF correctly without knowing the ProDOS file type (or making educated guesses).
There are four versions of OMF:
- v0.0: original 8-bit ORCA/M format. The header is 0x24 bytes, followed by a variable-length SEGNAME.
- v1.0: initial Apple IIgs format. Adds LCBANK, SEGNUM, ENTRY, DISPNAME, DISPDATA, and LOADNAME. Introduces BLKCNT/BYTECNT ambiguity.
- v2.0: updated Apple IIgs format. Removes LCBANK, redefines TYPE/KIND, and removes parsing ambiguity by embracing BYTECNT.
- v2.1: adds TEMPORG and some attribute flags.
Each file is a series of segments. Each segment has a header and a body; the body is a series of records.
The original "version 0" header, used by older 8-bit ORCA/M, is:
+$00 / 4: BLOCKCOUNT: length of segment in file, in 512-byte blocks
+$04 / 4: RESSPC: number of bytes of zeroes to add to the end of the segment
+$08 / 4: LENGTH: memory size required by segment when loaded; includes RESSPC
+$0c / 1: TYPE: enumerations and flags; bits 0-4 determine the segment type (code, data, etc.)
+$0d / 1: LABLEN: length of names in label records, or 0 if strings are prefixed with length
+$0e / 1: NUMLEN: length of numbers in segment body (must be 4)
+$0f / 1: VERSION: OMF version (0 for v0.0)
+$10 / 4: BANKSIZE: maximum memory bank size for segment
+$14 / 4: ORG: absolute address at which segment is to be loaded
+$18 / 4: ALIGN: boundary on which this segment must be aligned
+$1c / 1: NUMSEX: order of bytes, 0 for little-endian, 1 for big-endian (must be 0)
+$1d / 7: (reserved)
+$24 / nn: SEGNAME: segment name; string length specified by LABLEN
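As a minimal sketch of how the v0 layout above could be read, the following unpacks the fixed 0x24-byte portion with Python's `struct` module and then picks up SEGNAME. The names `V0Header` and `parse_v0_header` are illustrative, not part of any existing tool, and the handling of LABLEN == 0 assumes the length prefix is a single byte.

```python
import struct
from collections import namedtuple

# Little-endian layout copied from the v0 table: three 4-byte fields,
# four 1-byte fields, three 4-byte fields, NUMSEX, then 7 reserved bytes.
V0_FORMAT = "<IIIBBBBIIIB7x"

V0Header = namedtuple(
    "V0Header",
    "block_count resspc length seg_type lablen numlen version "
    "banksize org align numsex",
)

def parse_v0_header(data, offset=0):
    """Return (header, segname_bytes) for a v0 segment starting at offset."""
    hdr = V0Header._make(struct.unpack_from(V0_FORMAT, data, offset))
    name_pos = offset + struct.calcsize(V0_FORMAT)  # +$24
    if hdr.lablen == 0:
        # Assumption: a LABLEN of 0 means a one-byte length precedes the name.
        name_len = data[name_pos]
        name_pos += 1
    else:
        name_len = hdr.lablen
    return hdr, bytes(data[name_pos:name_pos + name_len])
```

The fixed-size format string doubles as a sanity check: `struct.calcsize(V0_FORMAT)` should equal 0x24, matching the offset of SEGNAME in the table.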
The v1.0 header is:
+$00 / 4: BLKCNT / BYTECNT: length of segment in file, either as 512-byte blocks or bytes
+$04 / 4: RESSPC: number of bytes of zeroes to add to the end of the segment
+$08 / 4: LENGTH: memory size required by segment when loaded; includes RESSPC
+$0c / 1: TYPE: enumerations and flags; bits 0-4 determine the segment type (code, data, etc.)
+$0d / 1: LABLEN: length of names in label records, or 0 if strings are prefixed with length
+$0e / 1: NUMLEN: length of numbers in segment body (must be 4)
+$0f / 1: VERSION: OMF version (1 for v1.0)
+$10 / 4: BANKSIZE: maximum memory bank size for segment
+$14 / 4: (reserved)
+$18 / 4: ORG: absolute address at which segment is to be loaded
+$1c / 4: ALIGN: boundary on which this segment must be aligned
+$20 / 1: NUMSEX: order of bytes, 0 for little-endian, 1 for big-endian (must be 0)
+$21 / 1: LCBANK: indicates the language card bank into which the segment should be loaded
+$22 / 2: SEGNUM: segment number
+$24 / 4: ENTRY: offset into the segment of the entry point
+$28 / 2: DISPNAME: displacement (offset) of the LOADNAME field
+$2a / 2: DISPDATA: displacement (offset) of the segment body
+DISPNAME / 10: LOADNAME: target segment for linker; padded with spaces to fill out 10 bytes
++$0a / nn: SEGNAME: segment name; string length specified by LABLEN
The TYPE field is mis-labeled KIND in early APW documentation.
The v2.0 header is:
+$00 / 4: BYTECNT: number of bytes in the file that the segment requires, including the header
+$04 / 4: RESSPC: number of bytes of zeroes to add to the end of the segment
+$08 / 4: LENGTH: memory size required by segment when loaded; includes RESSPC
+$0c / 1: (reserved)
+$0d / 1: LABLEN: length of names in label records, or 0 if strings are prefixed with length
+$0e / 1: NUMLEN: length of numbers in segment body (must be 4)
+$0f / 1: VERSION: OMF version (2 for v2.x)
+$10 / 4: BANKSIZE: maximum memory bank size for segment
+$14 / 2: KIND: enumerations and flags that define segment attributes
+$16 / 2: (reserved)
+$18 / 4: ORG: absolute address at which segment is to be loaded
+$1c / 4: ALIGN: boundary on which this segment must be aligned
+$20 / 1: NUMSEX: order of bytes, 0 for little-endian, 1 for big-endian (must be 0)
+$21 / 1: (reserved)
+$22 / 2: SEGNUM: segment number
+$24 / 4: ENTRY: offset into the segment of the entry point
+$28 / 2: DISPNAME: displacement (offset) of the LOADNAME field
+$2a / 2: DISPDATA: displacement (offset) of the segment body
+DISPNAME / 10: LOADNAME: target segment for linker; padded with spaces to fill out 10 bytes
++$0a / nn: SEGNAME: segment name; string length specified by LABLEN
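The v2.0 layout differs from v0 mainly in that the names are located through DISPNAME rather than at a fixed offset. The sketch below is one way to read it; `parse_v2_header` and the field tuple are hypothetical names, and it assumes DISPNAME and DISPDATA are measured from the start of the segment header and that a LABLEN of 0 means a one-byte length prefix.

```python
import struct

# Fixed 0x2c-byte portion of the v2.0 header; "x" skips reserved bytes.
V2_FORMAT = "<IIIxBBBIH2xIIBxHIHH"
V2_FIELDS = (
    "bytecnt", "resspc", "length", "lablen", "numlen", "version",
    "banksize", "kind", "org", "align", "numsex", "segnum",
    "entry", "dispname", "dispdata",
)

def parse_v2_header(data, offset=0):
    """Return a dict of v2.0 header fields plus LOADNAME and SEGNAME."""
    hdr = dict(zip(V2_FIELDS, struct.unpack_from(V2_FORMAT, data, offset)))
    # Assumption: DISPNAME is relative to the start of the segment.
    name_pos = offset + hdr["dispname"]
    hdr["loadname"] = bytes(data[name_pos:name_pos + 10])
    name_pos += 10  # SEGNAME immediately follows the 10-byte LOADNAME
    if hdr["lablen"] == 0:
        name_len = data[name_pos]  # assumed one-byte length prefix
        name_pos += 1
    else:
        name_len = hdr["lablen"]
    hdr["segname"] = bytes(data[name_pos:name_pos + name_len])
    return hdr
```

Under the same assumption, the segment body would start at `offset + hdr["dispdata"]`, and the next segment at `offset + hdr["bytecnt"]`.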
Version 2.1 adds a 4-byte field at +$2c called "tempOrg", immediately after DISPDATA, and defines some new KIND flags. The "temporary origin" field is used by the MPW IIgs cross-assembler.
Versions 0 and 1 aligned segments at 512-byte boundaries. The file length was generally also a multiple of 512.
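The first header field determines how far to advance to reach the next segment, and its units depend on the version. A rough sketch, using the hypothetical helper name `segment_file_length`: v0 is taken as blocks and v2 as bytes, while for v1 the caller has to say which interpretation applies, since the file itself does not.

```python
def segment_file_length(version, first_field, v1_is_blocks=None):
    """Return the number of bytes a segment occupies in the file.

    version      -- value of the VERSION byte at +$0f (0, 1, or 2)
    first_field  -- the 4-byte value at +$00
    v1_is_blocks -- for v1 only: True for BLKCNT, False for BYTECNT
    """
    if version == 0:
        return first_field * 512      # block count in 512-byte blocks
    if version == 2:
        return first_field            # BYTECNT, header included
    if version == 1:
        if v1_is_blocks is None:
            # The notes above explain that v1 cannot be resolved from
            # the segment alone; the ProDOS file type is one hint.
            raise ValueError("v1 first field is ambiguous (BLKCNT/BYTECNT)")
        return first_field * 512 if v1_is_blocks else first_field
    raise ValueError("unexpected OMF version %d" % version)
```

Raising on an undecided v1 file, rather than guessing, keeps the ambiguity visible to the caller.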
The GS/OS reference declares that an OMF file is "foreign" unless:
- NUMSEX field is 0
- NUMLEN field is 4
- BANKSIZE field is <= $10000
- ALIGN field is <= $10000
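The four conditions above translate directly into a predicate. This is only an illustration of the stated rule; `is_foreign` is a made-up name, and the real loader may apply additional checks.

```python
def is_foreign(numsex, numlen, banksize, align):
    """Return True if the header fails any of the four listed conditions."""
    native = (
        numsex == 0
        and numlen == 4
        and banksize <= 0x10000
        and align <= 0x10000
    )
    return not native
```

For example, `is_foreign(0, 4, 0x10000, 0x100)` is False, while a header with NUMSEX set to 1 would be reported as foreign.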
There are a couple of bugs in the GS/OS documentation:
- GS/OS ref: table F-2 says "blockCount" where it should say "SEGNAME", and shows the offset of "tempOrg" as $2a (should be $2c).
- GS/OS ref: appendix F refers to a "REVISION" field, which does not seem to exist.
The defined segment types, specified by the low 5 bits of the TYPE/KIND field, are:
$00: code
$01: data
$02: jump table
$04: pathname
$08: library dictionary
$10: initialization
$11: absolute bank (v1.0 only; became a flag in v2.0)
$12: DP/Stack
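Decoding the segment type is a matter of masking off the low 5 bits and looking the result up in the table above. In this sketch the dictionary and the function name `segment_type` are illustrative; values not in the list are reported as unknown rather than rejected.

```python
SEGMENT_TYPES = {
    0x00: "code",
    0x01: "data",
    0x02: "jump table",
    0x04: "pathname",
    0x08: "library dictionary",
    0x10: "initialization",
    0x11: "absolute bank",  # v1.0 only
    0x12: "DP/Stack",
}

def segment_type(type_or_kind):
    """Map a TYPE (v0/v1) or KIND (v2) value to a segment type name."""
    code = type_or_kind & 0x1f  # low 5 bits select the type
    return SEGMENT_TYPES.get(code, "unknown ($%02x)" % code)
```

The same function works for the 1-byte TYPE and the 2-byte KIND because the upper bits, which hold flags, are masked off before the lookup.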
The segment body is a series of records. Records start with an opcode byte, which can fall into three categories:
- $00: indicates the end of the segment
- $01-df: count of the number of bytes that follow; these are copied directly at load time
- $e0-ff: directives
Directives can be constants, relocation records, expressions, and so on.
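The three opcode categories can be sketched as a small classifier. Because the sizes of individual directives are not covered here, the companion helper only gathers the constant-data records at the start of a body and stops at the first end marker or directive. Both function names are hypothetical.

```python
def classify_opcode(op):
    """Return 'end', 'const', or 'directive' for a record opcode byte."""
    if op == 0x00:
        return "end"
    if 0x01 <= op <= 0xdf:
        return "const"      # opcode is the count of literal bytes
    if 0xe0 <= op <= 0xff:
        return "directive"
    raise ValueError("opcode must be a single byte")

def leading_const_data(body):
    """Collect literal bytes from consecutive const records.

    Returns (data, pos), where pos is the offset of the first record
    that is not a const record (or len(body) if none was found).
    """
    out = bytearray()
    pos = 0
    while pos < len(body):
        count = body[pos]
        if classify_opcode(count) != "const":
            break
        out += body[pos + 1:pos + 1 + count]
        pos += 1 + count
    return bytes(out), pos
```

A full reader would continue from `pos` by decoding the directive found there, which requires the per-directive formats from the references.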
ExpressLoad was introduced in System 5 to reduce the load time of executable binaries. It provides an index into the segments that reduces parsing. It can only be used with OMF v2.