- Apple II ROM tape routines (READ, WRITE)
- CiderPress implementation (https://github.com/fadden/ciderpress/blob/master/app/CassetteDialog.cpp)
There is a substantial library of Apple II cassette tape recordings at https://brutaldeluxe.fr/projects/cassettes/.
The typical storage medium used on early Apple II systems, before floppy drives became commonplace, was a consumer-grade cassette tape recorder. The Apple II records data as a frequency-modulated sine wave. A recording device can be connected to the dedicated audio I/O port on the Apple ][, ][+, and //e. The //c, ///, and IIgs do not have this port.
A tape can hold one or more chunks of data, each of which has the following structure:
- Entry tone: 10.6 seconds of 770Hz (8192 cycles at 1300 usec/cycle). This lets the human operator know that the start of data had been found.
- Tape-in edge: 1/2 cycle at 400 usec/cycle, followed by 1/2 cycle at 500 usec/cycle. This "short zero" indicates the transition between header and data.
- Data: one cycle per bit, using 500 usec/cycle (2kHz) for 0 and 1000 usec/cycle (1kHz) for 1.
There is no "end of data" indication, so it's up to the user to specify the length of data. The last byte of data is followed by an XOR checksum, initialized to $FF, that can be used to check for success.
Typical instructions for loading data from tape look like this:
- Type
LOAD
orxxxx.xxxxR
, but don't hit <return>. - Play tape until you here the tone.
- Immediately hit stop.
- Plug the cable from the Apple II into the tape player.
- Hit "play" on the recorder, then immediately hit <return>.
- When the Apple II beeps, it's done. Stop the tape.
The cassette I/O routines are built into the ROM. For binary code or data,
the command is issued from the system monitor. The length is specified on
the monitor command line, e.g. 800.1FFFR
would read $1800 (6144) bytes. For
BASIC programs and data, the length is included in an initial header section:
- Integer BASIC programs have a two-byte (little-endian) length.
- Applesoft BASIC programs (written with
SAVE
, loaded withLOAD
) have a three byte header: the two-byte length, followed by a "lock" flag byte (if the high bit is set, the program starts executing when theLOAD
completes). - Applesoft shape tables (loaded with
SHLOAD
) have a two-byte length. - Applesoft arrays (written with
STORE
, loaded withRECALL
) have a three-byte header: the length, followed by an unused byte.
The header section is a full data area, complete with 10.6-second lead-in.
The storage density varies from 2000bps for a file full of 0 bits to 1000bps for a file full of 1 bits. Assuming an equal distribution of bits, you can expect to transfer about 187 bytes/second (ignoring the header).
Some notes on what's required to decipher the recorded signal...
The monitor ROM routine uses a detection threshold of 700 usec to tell the difference between 0s and 1s. When reading, it outputs a tone for 3.5 seconds before listening. It doesn't try to detect the 770Hz tone, just waits for something under (40*12=)440 usec.
The Apple II hardware changes the high bit read from $c060 every time it detects a zero-crossing on the cassette input. I assume the polarity of the input signal is reflected by the polarity of the high bit, but I'm not sure, and in the end it doesn't really matter.
How quickly do we need to sample? The highest frequency we expect to find is 2KHz, so anything over 4KHz should be sufficient. However, we need to be able to resolve the time between zero transitions to some reasonable resolution. We need to tell the difference between a 650usec half-cycle and a 200usec half-cycle for the start, and 250/500usec for the data section. Our measurements can comfortably be off by 200 usec with no ill effects on the lead-in, assuming a perfect signal. (Sampling every 200 usec would be 5Hz.) The data itself needs to be +/- 125usec for half-cycles, though we can get a little sloppier if we average the error out by combining half-cycles.
The signal is less than perfect, sometimes far less, so we need better sampling to avoid magnifying distortions in the signal. If we sample at 22.05KHz, we could see a 650usec gap as 590, 635, or 680, depending on when we sample and where we think the peaks lie. We're off by 15usec before we even start. We can reasonably expect to be off +/- twice the "usecPerSample" value. At 8KHz, that's +/- 250usec, which isn't acceptable. At 11KHz we're at +/- 191usec, which is scraping along.
We can get mitigate some problems by doing an interpolation of the two points nearest the zero-crossing, which should give us a more accurate fix on the zero point than simply choosing the closest point. This does potentially increase our risk of errors due to noise spikes at points near the zero. Since we're reading from cassette, any noise spikes are likely to be pretty wide, so averaging the data or interpolating across multiple points isn't likely to help us.
Some tapes seem to have a low-frequency distortion that amounts to a DC bias when examining a single sample. Timing the gaps between zero crossings is therefore not sufficient unless we also correct for the local DC bias. In some cases the recorder or media was unable to respond quickly enough, and as a result 0s have less amplitude than 1s. This throws off some simple correction schemes.
The easiest approach is to figure out where one cycle starts and stops, and use the timing of the full cycle. This gets a little ugly because the original output was a square wave, so there's a bit of ringing in the peaks, especially the 1s. Of course, we have to look at half-cycles initially, because we need to identify the first "short 0" part. Once we have that, we can use full cycles, which distributes any error over a larger set of samples.
In some cases the positive half-cycle is longer than the negative half-cycle (e.g. reliably 33 samples vs. 29 samples at 48KHz, when 31.2 is expected for 650us). Slight variations can lead to even greater distortion, even though the timing for the full signal is within tolerances. This means we need to accumulate the timing for a full cycle before making an evaluation, though we still need to examine the half-cycle timing during the lead-in to catch the "short 0".
Because of these distortions, 8-bit 8KHz audio is probably not a good idea. 22.05KHz sampling is a better choice for tapes that have been sitting around for 25-30 years.
Here are some tips on capturing audio from an Apple II data recording. This comes from the manual for the original CiderPress application.
It isn't necessary to record each section of data from the cassette into its own WAV file. CiderPress will try to find every chunk of data in a WAV file.
If you have a "line out" on your tape player and a "line in" on your PC sound card, use those. If not, you can use the "microphone" input and the "headphone" out, though you will have to set the volume levels correctly. In the "speaker" or "multimedia" control panel, set the microphone input gain to 50%. Start up your sound editor. If you have a way to see the input level on the microphone, turn that on. Play the cassette tape out loud until you hear a tone, then plug it into the computer and watch the input level. You want to set the volume so that the input is as high as you can get it without exceeding the limit (this causes "clipping", which is a lot like a square wave but probably isn't going to help us here).
Once you have the volume level figured out, back the tape up to the start of the tone. Hit "record" in your software and "play" on your tape player. Record at 22.05KHz with 8-bit monaural samples. (Recording at CD quality, 44.1KHz with 16-bit stereo samples, doesn't help and requires 8x the space.) If your software shows an input meter while recording, continue to record until the volume level drops and stays low for at least 10 seconds. If you can't monitor the input, you will either need to time the cassette, or just record for a long time and perhaps trim the excess off in the sound editor. Make sure you get all of the data from the tape. When you think you're done, pull the audio plug out of the tape player and keep listening for a little bit.
Tip: CiderPress only needs to see about a second of the lead-in, so it's okay to fiddle with the volume while the initial tone is playing.
Tip: in some cases, setting the volume a little too high can be beneficial. It's better to clip some samples than have too little signal. If at first you don't succeed, crank up the volume a notch and try again.
Most cassettes include more than one copy of a program. In some cases (such as Adventure International's "Asteroid") they are slightly different implementations, while in others it's the same program repeated. Sometimes the program is repeated on the back side of the tape. Magnetic tapes wear out if you play them too much, so redundancy was common.
The output of the Apple II is a blocky "square wave" rather than a smooth "sine wave". Because of limitations in how quickly voltage levels can change, the output isn't perfectly square. Because of the physical properties of and variations in magnetic media, the not-quite-square wave is rather rounded and wiggly. After being stored in less-than-perfect conditions for 25-30 years, what you read back from an Apple II tape can look pretty crazy.
Most cassettes can be recovered, even those that will no longer play on an Apple II. If you find one that can't, you may want to keep the WAV recording anyway, on the off chance that in the future an improved algorithm can be developed that will decode it.
When CiderPress encounters data that it can't interpret, it stops trying to read from that section of the WAV file. For this reason, damaged entries will usually be shorter than undamaged ones. If a file appears to have the correct length but the checksum still doesn't match, it means the signal was sufficiently distorted to make a '0' bit look like a '1' bit, which is actually pretty hard to do. In most cases the decoder will either make an accurate determination or will conclude that the signal is too distorted to process. So far only one case has been found where the checksum was deliberately altered, as part of a copy protection scheme (Sargon II).
If the tape has more than one program on it, you can usually tell if it's multiple copies of the same thing by comparing lengths and checksums. If the checksums say "good" but have different values, you probably have two different programs, or two slightly different versions of the same program.
You may be tempted to store copies of the WAV file in MP3 format. This is not recommended. CiderPress cannot decode MP3s, and the decoded MP3 file is less likely to work than the original. However, experiments with converting the sound files in and out of MP3 format suggest that "healthy" files are unharmed at reasonable compression ratios.
You don't need fancy equipment. Connecting the headphone jack of a 15-year-old "boom box" to the microphone jack of a low-cost PC with on-motherboard audio works just fine.
For reference, here is a disassembly of the ROM routines. The code has been rearranged to make it easier to read.
; Increment 16-bit value at 0x3c (A1) and compare it to 16-bit value at
; 0x3e (A2). Returns with carry set if A1 >= A2.
; Requires 26 cycles in common case, 30 cycles in rare case.
FCBA: A5 3C 709 NXTA1 LDA A1L ;INCR 2-BYTE A1.
FCBC: C5 3E 710 CMP A2L
FCBE: A5 3D 711 LDA A1H ; AND COMPARE TO A2
FCC0: E5 3F 712 SBC A2H
FCC2: E6 3C 713 INC A1L ; (CARRY SET IF >=)
FCC4: D0 02 714 BNE RTS4B
FCC6: E6 3D 715 INC A1H
FCC8: 60 716 RTS4B RTS
; Write data from location in A1L up to location in A2L.
FECD: A9 40 975 WRITE LDA #$40
FECF: 20 C9 FC 976 JSR HEADR ;WRITE 10-SEC HEADER
; Write loop. Continue until A1 reaches A2.
FED2: A0 27 977 LDY #$27
FED4: A2 00 978 WR1 LDX #$00
FED6: 41 3C 979 EOR (A1L,X)
FED8: 48 980 PHA
FED9: A1 3C 981 LDA (A1L,X)
FEDB: 20 ED FE 982 JSR WRBYTE
FEDE: 20 BA FC 983 JSR NXTA1
FEE1: A0 1D 984 LDY #$1D
FEE3: 68 985 PLA
FEE4: 90 EE 986 BCC WR1
; Write checksum byte, then beep the speaker.
FEE6: A0 22 987 LDY #$22
FEE8: 20 ED FE 988 JSR WRBYTE
FEEB: F0 4D 989 BEQ BELL
; Write one byte (8 bits, or 16 half-cycles).
; On exit, Z-flag is set.
FEED: A2 10 990 WRBYTE LDX #$10
FEEF: 0A 991 WRBYT2 ASL
FEF0: 20 D6 FC 992 JSR WRBIT
FEF3: D0 FA 993 BNE WRBYT2
FEF5: 60 994 RTS
; Write tape header. Called by WRITE with A=$40, READ with A=$16.
; On exit, A holds $FF.
; First time through, X is undefined, so we may get slightly less than
; A*256 half-cycles (i.e. A*255 + X). If the carry is clear on entry,
; the first ADC will subtract two (yielding A*254+X), and the first X
; cycles will be "long 0s" instead of "long 1s". Doesn't really matter.
FCC9: A0 4B 717 HEADR LDY #$4B ;WRITE A*256 'LONG 1'
FCCB: 20 DB FC 718 JSR ZERDLY ; HALF CYCLES
FCCE: D0 F9 719 BNE HEADR ; (650 USEC EACH)
FCD0: 69 FE 720 ADC #$FE
FCD2: B0 F5 721 BCS HEADR ;THEN A 'SHORT 0'
; Fall through to write bit. Note carry is clear, so we'll use the zero
; delay. We've initialized Y to $21 instead of $32 to get a short '0'
; (165usec) for the first half and a normal '0' for the second half;
FCD4: A0 21 722 LDY #$21 ; (400 USEC)
; Write one bit. Called from WRITE with Y=$27.
FCD6: 20 DB FC 723 WRBIT JSR ZERDLY ;WRITE TWO HALF CYCLES
FCD9: C8 724 INY ; OF 250 USEC ('0')
FCDA: C8 725 INY ; OR 500 USEC ('0')
; Delay for '0'. X typically holds a bit count or half-cycle count.
; Y holds delay period in 5-usec increments:
; (carry clear) $21=165us $27=195us $2C=220 $4B=375us
; (carry set) $21=165+250=415us $27=195+250=445us $4B=375+250=625us
; Remember that TOTAL delay, with all other instructions, must equal target
; On exit, Y=$2C, Z-flag is set if X decremented to zero. The 2C in Y
; is for WRBYTE, which is in a tight loop and doesn't need much padding.
FCDB: 88 726 ZERDLY DEY
FCDC: D0 FD 727 BNE ZERDLY
FCDE: 90 05 728 BCC WRTAPE ;Y IS COUNT FOR
; Additional delay for '1' (always 250us).
FCE0: A0 32 729 LDY #$32 ; TIMING LOOP
FCE2: 88 730 ONEDLY DEY
FCE3: D0 FD 731 BNE ONEDLY
; Write a transition to the tape.
FCE5: AC 20 C0 732 WRTAPE LDY TAPEOUT
FCE8: A0 2C 733 LDY #$2C
FCEA: CA 734 DEX
FCEB: 60 735 RTS
; Read data from location in A1L up to location in A2L.
FEFD: 20 FA FC 999 READ JSR RD2BIT ;FIND TAPEIN EDGE
FF00: A9 16 1000 LDA #$16
FF02: 20 C9 FC 1001 JSR HEADR ;DELAY 3.5 SECONDS
FF05: 85 2E 1002 STA CHKSUM ;INIT CHKSUM=$FF
FF07: 20 FA FC 1003 JSR RD2BIT ;FIND TAPEIN EDGE
; Loop, waiting for edge. 11 cycles/iteration, plus 432+14 = 457usec.
FF0A: A0 24 1004 RD2 LDY #$24 ;LOOK FOR SYNC BIT
FF0C: 20 FD FC 1005 JSR RDBIT ; (SHORT 0)
FF0F: B0 F9 1006 BCS RD2 ; LOOP UNTIL FOUND
; Timing of next transition, a normal '0' half-cycle, doesn't matter.
FF11: 20 FD FC 1007 JSR RDBIT ;SKIP SECOND SYNC H-CYCLE
; Main byte read loop. Continue until A1 reaches A2.
FF14: A0 3B 1008 LDY #$3B ;INDEX FOR 0/1 TEST
FF16: 20 EC FC 1009 RD3 JSR RDBYTE ;READ A BYTE
FF19: 81 3C 1010 STA (A1L,X) ;STORE AT (A1)
FF1B: 45 2E 1011 EOR CHKSUM
FF1D: 85 2E 1012 STA CHKSUM ;UPDATE RUNNING CHKSUM
FF1F: 20 BA FC 1013 JSR NXTA1 ;INC A1, COMPARE TO A2
FF22: A0 35 1014 LDY #$35 ;COMPENSATE 0/1 INDEX
FF24: 90 F0 1015 BCC RD3 ;LOOP UNTIL DONE
; Read checksum byte and check it.
FF26: 20 EC FC 1016 JSR RDBYTE ;READ CHKSUM BYTE
FF29: C5 2E 1017 CMP CHKSUM
FF2B: F0 0D 1018 BEQ BELL ;GOOD, SOUND BELL AND RETURN
; Print "ERR", beep speaker.
FF2D: A9 C5 1019 PRERR LDA #$C5
FF2F: 20 ED FD 1020 JSR COUT ;PRINT "ERR", THEN BELL
FF32: A9 D2 1021 LDA #$D2
FF34: 20 ED FD 1022 JSR COUT
FF37: 20 ED FD 1023 JSR COUT
FF3A: A9 87 1024 BELL LDA #$87 ;OUTPUT BELL AND RETURN
FF3C: 4C ED FD 1025 JMP COUT
; Read a byte from the tape. Y is $3B on first call, $35 on subsequent
; calls. The bits are shifted left, meaning that the high bit is read
; first.
FCEC: A2 08 736 RDBYTE LDX #$08 ;8 BITS TO READ
FCEE: 48 737 RDBYT2 PHA ;READ TWO TRANSITIONS
FCEF: 20 FA FC 738 JSR RD2BIT ; (FIND EDGE)
FCF2: 68 739 PLA
FCF3: 2A 740 ROL ;NEXT BIT
FCF4: A0 3A 741 LDY #$3A ;COUNT FOR SAMPLES
FCF6: CA 742 DEX
FCF7: D0 F5 743 BNE RDBYT2
FCF9: 60 744 RTS
; Read two bits from the tape.
FCFA: 20 FD FC 745 RD2BIT JSR RDBIT
; Read one bit from the tape. On entry, Y is the expected transition time:
; $3A=696usec $35=636usec $24=432usec
; Returns with the carry set if the transition time exceeds the Y value.
FCFD: 88 746 RDBIT DEY ;DECR Y UNTIL
FCFE: AD 60 C0 747 LDA TAPEIN ; TAPE TRANSITION
FD01: 45 2F 748 EOR LASTIN
FD03: 10 F8 749 BPL RDBIT
; the above loop takes 12 usec per iteration, what follows takes 14.
FD05: 45 2F 750 EOR LASTIN
FD07: 85 2F 751 STA LASTIN
FD09: C0 80 752 CPY #$80 ;SET CARRY ON Y
FD0B: 60 753 RTS