Disk Images and File Archives
A "disk image" is a file containing the contents of a floppy disk, hard drive, CD-ROM or other physical storage medium. Disk images were typically created on vintage systems so that the disk contents could be sent electronically. The images often, but not always, have files arranged in a filesystem such as ProDOS or HFS.
A "file archive" is a file that contains a collection of other files. Minimizing the size of the archive is often important, so the contents of file archives are usually packed tightly.
(Sometimes this documentation will refer to disk images and file archives collectively as "archives" for the sake of brevity.)
The nature of a filesystem allows changes, such as adding or deleting a file, to be made with little to no disruption to existing data. In contrast, file archives may require a significant amount of data to be reshuffled for something as simple as renaming a file.
CiderPress II handles disk images and file archives differently. Changes to disk images are made directly, as if the image were a mounted filesystem. Changes to file archives are made by creating an entirely new archive file, and renaming it in place of the original when all changes have been completed. This difference affects what happens when an operation fails or is canceled. For example, if you try to add three files to a disk image, and cancel the process before adding the third file, the first two files will be present afterward. For a file archive, canceling the operation partway through would leave the archive entirely untouched.
One important consequence of this structure is that, while killing the application mid-operation on a file archive will at worst leave you with a stray temporary file, doing so mid-operation on a disk image could result in disk corruption (just as it would if you turned off a computer mid-write to a disk). The application flushes all changes before returning to an idle state, so everything is safe once an operation has completed. Additional steps are taken to ensure that all data is in a safe state whenever an operation is paused to ask for input, e.g. when obtaining permission to overwrite a file.
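The rewrite-then-rename approach used for file archives can be illustrated with a minimal Python sketch. It uses ZIP and the standard `zipfile` module purely as a stand-in; the function name `add_files_atomically` is hypothetical, and this is not CiderPress II code.

```python
import os
import tempfile
import zipfile


def add_files_atomically(archive_path, new_entries):
    """Rebuild a ZIP archive with extra entries, then swap it into place.

    new_entries maps entry names to bytes. If anything fails before the
    final rename, the original archive is left untouched.
    """
    directory = os.path.dirname(os.path.abspath(archive_path))
    # Create the temporary file next to the original so the final rename
    # stays on one filesystem.
    fd, temp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    os.close(fd)
    try:
        with zipfile.ZipFile(archive_path, "r") as old, \
                zipfile.ZipFile(temp_path, "w", zipfile.ZIP_DEFLATED) as new:
            # Copy existing entries, skipping any that are being replaced.
            for info in old.infolist():
                if info.filename not in new_entries:
                    new.writestr(info, old.read(info.filename))
            for name, data in new_entries.items():
                new.writestr(name, data)
        # Only now does the original archive change.
        os.replace(temp_path, archive_path)
    except BaseException:
        # Failure or cancellation: discard the partial copy.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

If the second of two additions fails, the first one is discarded along with the temporary file, which mirrors the "entirely untouched" behavior described above. A disk image edited in place has no equivalent rollback.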
It's possible to store disk images and file archives inside other disk images and file archives (a "turducken"). For example, a ShrinkIt archive could be stored on a ProDOS filesystem in a WOZ disk image in a ZIP archive. CiderPress II fully supports nested archives, allowing direct access to the files at any level.
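As a rough illustration of nested access, the following Python sketch walks a path through archives stored inside archives. It handles only ZIP-in-ZIP using the standard library, and the helper names `read_nested` and `make_zip` are invented for this example; CiderPress II's own handling covers many more formats and is not shown here.

```python
import io
import zipfile


def read_nested(outer_bytes, path):
    """Follow a list of entry names through nested ZIP archives.

    Every name except the last must refer to an entry that is itself a
    ZIP archive. Returns the bytes of the final entry.
    """
    data = outer_bytes
    for name in path:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            data = archive.read(name)
    return data


def make_zip(entries):
    """Build an in-memory ZIP archive from a dict of name -> bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in entries.items():
            archive.writestr(name, content)
    return buffer.getvalue()


inner = make_zip({"HELLO.TXT": b"hello from the inner archive"})
outer = make_zip({"inner.zip": inner, "README": b"outer file"})
print(read_nested(outer, ["inner.zip", "HELLO.TXT"]))
```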
It's important to note that, because of the way that modifications to file archives are handled, it's necessary to have enough space to hold both the old archive and the new archive. When the file archive is on the host system that's rarely an issue, but when the file archive is in a small disk image it's easy to run out of space.
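The space requirement is simple arithmetic, sketched below in Python. The figures are illustrative only: a nominal 140 KB floppy image with filesystem overhead ignored, and a hypothetical helper named `fits_during_rewrite`.

```python
def fits_during_rewrite(volume_capacity, used_bytes, new_archive_size):
    """Return True if the new archive fits alongside everything already stored.

    used_bytes includes the old archive, which stays in place until the
    rewrite has finished.
    """
    free_bytes = volume_capacity - used_bytes
    return new_archive_size <= free_bytes


FLOPPY = 140 * 1024        # nominal capacity, ignoring filesystem overhead
old_archive = 80 * 1024
new_archive = 82 * 1024    # old contents plus one small added file

print(fits_during_rewrite(FLOPPY, old_archive, new_archive))  # False: only 60 KB free
print(fits_during_rewrite(FLOPPY, 40 * 1024, 42 * 1024))      # True: 100 KB free
```

In the first case the archive occupies well under the disk's capacity both before and after the change, yet the update still cannot proceed because both copies must exist at once.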
Add/Extract vs. Import/Export
There are four distinct operations for adding and extracting files:
- Extract: extract a file from an archive without modification. Attempt to preserve file attributes.
- Add: add a file to an archive without modification. Attempt to restore file attributes from saved metadata.
- Export: extract a file from an archive, converting it to something new. This could be a simple adjustment to a text file, or a conversion from Apple II hi-res to PNG.
- Import: add a file to an archive, converting its format. For example, the end-of-line markers in a text file might be changed from CRLF to CR, or an Applesoft BASIC program could be converted from a text file to tokenized form.
Utilities such as NuLib2 and the original CiderPress blend the operations together, which can lead to some ambiguous behavior. In CiderPress II, add/extract are always distinctly different operations from import/export.
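The distinction can be sketched with a text file's end-of-line markers. This Python example is only an illustration of the concept; the function names are hypothetical and do not correspond to CiderPress II's API, and `export_text` assumes the stored text uses bare CR line endings.

```python
def add_file(data: bytes) -> bytes:
    """'Add': store the bytes exactly as they are on the host."""
    return data


def import_text(data: bytes) -> bytes:
    """'Import': convert host line endings (CRLF or LF) to a bare CR."""
    return data.replace(b"\r\n", b"\r").replace(b"\n", b"\r")


def export_text(data: bytes, host_eol: bytes = b"\r\n") -> bytes:
    """'Export': convert bare CR line endings to the host convention."""
    return data.replace(b"\r", host_eol)


host_file = b"10 PRINT \"HI\"\r\n20 END\r\n"
print(add_file(host_file) == host_file)     # True: add never alters content
print(import_text(host_file))               # CRLF replaced by CR
```

Keeping the two paths separate means a caller always knows whether the bytes in the archive match the bytes on the host.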
Changes from CiderPress
The original CiderPress, first published in 2003, is a Windows-only application that can be run on other platforms with the use of the Wine compatibility layer. The code was written in C++, using the Windows MFC toolkit. Some of the lower-level functions were implemented in portable libraries that were used by other applications.
CiderPress II is written in C#, targeted at .NET 6. It gives equal importance to GUI and command-line interfaces, and can run on a variety of systems, including Windows, Mac OS, and Linux.
In addition to significant new features like a command-line interface, drag & drop file management, and WOZ disk image support, there are some subtler changes:
- File archives and disk images nested inside other file archives and disk images can be accessed directly.
- When files are extracted, the resource fork and extended attributes can be preserved in multiple ways: AppleSingle, AppleDouble, NAPS (NuLib2 Attribute Preservation Strings), or using host filesystem features (Mac OS / HFS+ only). These are handled transparently when adding files.
- DOS T/I/A/B files can be opened in "raw" mode.
- Files may be copied directly between volumes. For DOS files this can preserve the sparse structure of random-access text files.
- AppleSingle and AppleDouble are integrated into add/extract. In the original, AppleSingle was treated as a read-only archive.
- DOS hybrid (e.g. DOS + ProDOS on a single disk) support has been added, and the handling of DOS.MASTER embedded volumes has been greatly improved.
- HFS file type support has been generalized. ProDOS and HFS types can be set independently in places where both are present (NuFX archives, ProDOS extended files).
- Errors and warnings generated by lower-level code, such as filesystem implementations, are now presented to the user as "notes".
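Of the attribute preservation methods listed above, NAPS is the simplest to illustrate, since it stores the file type in the host filename. The Python sketch below assumes the common form of a `#` followed by a two-digit hex file type and a four-digit hex auxiliary type, with an optional trailing `r` marking a resource fork. The function name `split_naps` is hypothetical, and other NAPS variants are not handled.

```python
import re

# Assumed form: "#" + 2 hex digits (file type) + 4 hex digits (aux type),
# optionally followed by "r" to mark a resource fork.
_NAPS_RE = re.compile(
    r"^(?P<name>.+)#(?P<type>[0-9a-fA-F]{2})(?P<aux>[0-9a-fA-F]{4})(?P<fork>r?)$"
)


def split_naps(host_name):
    """Split a host filename into (name, file_type, aux_type, is_resource_fork).

    Names without a recognizable suffix are returned unchanged, with None
    for both type values.
    """
    match = _NAPS_RE.match(host_name)
    if match is None:
        return host_name, None, None, False
    return (
        match.group("name"),
        int(match.group("type"), 16),
        int(match.group("aux"), 16),
        match.group("fork") == "r",
    )


print(split_naps("HELLO#fc0801"))
print(split_naps("plain.txt"))
```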
A few things have been removed and are not expected to return, due to lack of interest:
- NuFX archives with deflate, bzip2, and LZC compression are no longer supported.
- The FDI disk image format has been dropped.
- SST file combining has been dropped.
Under the hood there are many significant changes, such as:
- NufxLib and libhfs have been replaced.
- The original CiderPress disk image library had some file update limitations, notably that files had to be written all at once. The new library returns a Stream object that can be used just like one opened on a file in the host filesystem.
- Compression code uses the same API as the standard System.IO.Compression classes, making it easy to integrate NuFX LZW or Squeeze compression into code that doesn't want the rest of the NuFX archive handling.
- The file conversion library returns platform-agnostic objects that can be converted to TXT/RTF/PNG/CSV, rather than directly generating Windows-specific bitmaps and RTF.