Git and RISC OS Filetypes

git is a Unix-oriented version control system, for obvious reasons. Consequently, RISC OS – sporting a noticeably non-Unix filesystem – doesn’t fit into git’s view of things. Particularly when it comes to RISC OS’s concept of filetypes…

Filetypes?

Whilst Unix systems conventionally rely on file extensions to identify the type of a file (“hello.txt”), RISC OS instead associates a 12-bit filetype with each file, which is stored (somewhat invisibly) as part of the file’s details in the filing system. That doesn’t sound like much of a problem, but once files are stored in a git repository, git’s “blindness” to RISC OS filetypes starts to bite…

Why does it Bite?

When git stores an ordinary file in its repository it has absolutely zero concept of what “type” that file is: as far as git is concerned, a file is a file is a file. This is because within Unix-like environments, the type of a file is conventionally inferred from its file extension (for example, a file called “hello.txt” is considered a text file because of the “.txt” at the end of the filename, whilst a file called “hello.sh” would be considered a shell script).

But RISC OS, with its concept of filetypes rather than file extensions, doesn’t work this way.

If (by way of illustration) a BASIC file is created on RISC OS, it might be called “!RunImage”. And that’s it. Its filename is “!RunImage”. Nothing more. Nothing less. Alongside storing the file content, RISC OS will store its associated filetype (in this case “0xffb” – a 12-bit value representing “a BASIC file”) as part of the file’s record. It may not be immediately obvious, but what’s important here is that RISC OS is storing the filetype separately from the filename.

And here’s the rub: ideally, when storing a file in git, the repository file should be called exactly what it’s called in the filesystem; in this example, the BASIC program was called “!RunImage”, so the file inside the git repository ought also to be called “!RunImage”.

But because RISC OS deals in filetypes whilst git doesn’t, existing repositories used within the RISC OS community have necessarily gravitated toward the Unix approach, whereby the RISC OS filetype is added as an extension of the filename – either as the 12-bit filetype value in hexadecimal or as a more human-readable type name – to ensure the repository has *some* kind of understanding of what type each file is. (So for example, the file “!RunImage” would be stored in the repository as a file called “!RunImage,ffb” or perhaps as “!RunImage,BASIC”.)
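
To make the suffix convention concrete, here is a minimal Python sketch that splits a repository filename such as “!RunImage,ffb” back into the RISC OS name and its 12-bit filetype. The function name is purely illustrative, and only the three-hex-digit form is handled; textual suffixes such as “,BASIC” are left untouched because the article does not define a name-to-type table.

```python
import re

# Matches a trailing ",xxx" where xxx is exactly three hex digits (12 bits).
_HEX_SUFFIX = re.compile(r"^(?P<name>.+),(?P<hex>[0-9a-fA-F]{3})$")


def split_filetype_suffix(repo_name):
    """Return (riscos_name, filetype) for names such as '!RunImage,ffb'.

    filetype is an int in the range 0x000-0xFFF, or None when the name
    carries no recognisable three-digit hex suffix.
    """
    match = _HEX_SUFFIX.match(repo_name)
    if match is None:
        return repo_name, None
    return match.group("name"), int(match.group("hex"), 16)


name, filetype = split_filetype_suffix("!RunImage,ffb")
print(name, hex(filetype))  # !RunImage 0xffb
```

Note that the filetype has to be peeled off the name before the file is usable on RISC OS, which is exactly the kind of bookkeeping the suffix approach forces onto every tool that touches the repository.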

“Polluting” filenames in a repository in this way isn’t ideal.

So what’s the Solution?

Step forward: the .gitattributes file.

The .gitattributes file is a standard git file that attaches attributes (metadata) to paths in the repository. In terms of RISC OS, this can be used to keep track of the 12-bit filetype of each file, allowing git repositories to record (cleanly and legitimately) RISC OS filetypes whilst remaining compatible with other git implementations, and at minimal cost.
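
As an illustration of what such a record might look like, the Python sketch below formats one .gitattributes line per file. The attribute name riscos-filetype and the three-digit hex value are assumptions made for this example; the article does not specify the on-disk format. One detail is real git behaviour, though: a .gitattributes pattern beginning with “!” is treated as a (disallowed) negative pattern, so a name like “!RunImage” needs a leading backslash.

```python
ATTR_NAME = "riscos-filetype"  # hypothetical attribute name, not defined by git


def attr_line(path, filetype):
    """Format one .gitattributes line recording a filetype for a path."""
    if not 0 <= filetype <= 0xFFF:
        raise ValueError("RISC OS filetypes are 12-bit values")
    if any(ch.isspace() for ch in path):
        # Patterns containing whitespace need quoting; omitted in this sketch.
        raise ValueError("paths containing whitespace are not handled here")
    # Escape a leading "!" so git reads it literally rather than as negation.
    pattern = "\\" + path if path.startswith("!") else path
    return f"{pattern} {ATTR_NAME}={filetype:03x}"


print(attr_line("!RunImage", 0xFFB))  # \!RunImage riscos-filetype=ffb
```

Because this is an ordinary attribute, stock git on any platform could read it back with git check-attr riscos-filetype -- '!RunImage', without knowing anything about RISC OS. The sketch ignores glob characters in filenames, which a real implementation would also have to escape.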

The way in which this can be utilised is by introducing a RISC OS-specific git command: git filetype

Looking Forwards – git filetype

With a git repository – new or existing – git can be instructed to begin recording the RISC OS filetype of a file using the following command:

git filetype --snapshot [--all | <filespec>]

Having executed this command, git will store the filename-to-filetype association in the .gitattributes file. The git commit command may then be used to record this information formally in the repository. This is useful as it ensures an auditable trail of such changes in the repository, i.e. recording a file’s type is as relevant an event to git as changing the file’s content.

When a branch is checked out – either explicitly by use of the git checkout command or implicitly by use of the git clone command – the RISC OS git client will seek to ensure all files in the working directory reflect their filetypes (where such filetypes have been snapshotted) by effecting the equivalent of the manual command:

git filetype --apply --all

This command complements the --snapshot switch, informing the RISC OS git client that filetype mappings stored in the repository should be applied to files checked out in the working tree.
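
A rough Python sketch of the first half of --apply, reading the recorded mappings back out, might look as follows. It assumes the filetypes are held in a hypothetical riscos-filetype attribute with a hex value, and it treats each pattern as a literal path; real git attribute matching also supports globs and lets later lines override earlier ones. Actually stamping the filetype onto each file is a RISC OS filing system operation and is not shown.

```python
ATTR_NAME = "riscos-filetype"  # hypothetical attribute name


def read_filetypes(gitattributes_text):
    """Map each literal path in the text to its recorded filetype (an int)."""
    mapping = {}
    for raw in gitattributes_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        if pattern.startswith("\\!"):
            pattern = pattern[1:]  # undo the escape on a literal leading "!"
        for attr in attrs:
            name, _, value = attr.partition("=")
            if name == ATTR_NAME and value:
                mapping[pattern] = int(value, 16)
    return mapping


sample = """\
# filetypes recorded by a snapshot (hypothetical layout)
\\!RunImage riscos-filetype=ffb
Docs/ReadMe riscos-filetype=fff
*.c text
"""

for path, filetype in read_filetypes(sample).items():
    print(f"{path}: {filetype:03x}")
```

Lines that carry other attributes, such as the *.c text entry, are simply skipped, which is what allows the filetype records to share a .gitattributes file with a repository’s existing settings.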

Note: Correspondingly, a new switch, “--ignore-filetypes”, will be added to the checkout and clone commands, which will ensure that when a working tree is checked out, filetypes will not be set (should this be required).

This approach will allow the RISC OS git client to provide mechanism, but not policy: existing git repositories can continue to work exactly as they have done before, whilst new repositories (or existing ones) may choose to introduce the filetype mappings via use of the git filetype facilities. And of course non-RISC OS environments will be able to work with RISC OS git repositories with no change to their behaviour.

Looking Backwards – Compatibility

Existing repositories often carry a lot of history, and if the git filetype command is adopted at a particular point in time, the question remains: how will the RISC OS git client help when dealing with repository content from before that point?

The answer here is relatively straightforward: when the RISC OS git client checks out a working tree – unless otherwise directed by an --ignore-filetypes switch – each file will be considered for its filetype in turn. In the first instance, if the .gitattributes file exists and records a filetype for the file, that filetype will be used (as snapshotting it with git filetype will have been an explicit request for git to manage the file’s type); otherwise the RISC OS git client will fall back to the older approach of examining the filename to determine whether it can reliably apply a filetype to it.
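
The order of precedence described above can be sketched in Python as follows. The function names, the shape of the two lookup tables and the rule for what counts as an “extension” (everything from the last comma or dot in the leaf name) are all assumptions for illustration rather than the client’s actual behaviour.

```python
def extension_of(path):
    """Return the suffix from the last ',' or '.' in the leaf name, or ''."""
    leaf = path.rsplit("/", 1)[-1]
    idx = max(leaf.rfind(","), leaf.rfind("."))
    return leaf[idx:].lower() if idx > 0 else ""


def resolve_filetype(path, recorded, extension_map, ignore_filetypes=False):
    """Pick a filetype for a checked-out file, or None to leave it unset.

    recorded      -- filetypes snapshotted into .gitattributes, keyed by path
    extension_map -- locally configured extension -> filetype fallbacks
    """
    if ignore_filetypes:
        return None  # the --ignore-filetypes case
    if path in recorded:
        return recorded[path]  # an explicit snapshot always wins
    return extension_map.get(extension_of(path))  # legacy filename fallback


recorded = {"!RunImage": 0xFFB}
extension_map = {",ffb": 0xFFB, ".txt": 0xFFF}

for path in ("!RunImage", "Docs/hello.txt", "Old/!RunImage,ffb", "Makefile"):
    print(path, resolve_filetype(path, recorded, extension_map))
```

A file with neither a snapshot nor a recognised extension comes back as None, leaving it to the client to apply whatever default it would have used anyway.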

For the latter process, the RISC OS git client will utilise a series of “file extension -> filetype” mappings defined within the local environment, which can be configured by use of the git config riscos.filetypes command: a file describing the mappings may be supplied locally, so they can be configured, if need be, on a per-user basis. This is a RISC OS-specific configuration option, designed to ensure that individual users may determine how they wish file extensions to be mapped locally in their environment.
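
The layout of the mappings file is not specified here, so the following Python sketch simply assumes one “extension, hex filetype” pair per line with “#” comments, and shows how such a file could be loaded into a lookup table. Both the format and the function name are hypothetical.

```python
def load_extension_map(text):
    """Parse lines of '<extension> <hex filetype>' into a dict."""
    mapping = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        try:
            extension, hex_type = line.split()
            filetype = int(hex_type, 16)
        except ValueError:
            raise ValueError(
                f"line {number}: expected '<extension> <hex filetype>'"
            ) from None
        if not 0 <= filetype <= 0xFFF:
            raise ValueError(f"line {number}: filetype must fit in 12 bits")
        mapping[extension.lower()] = filetype
    return mapping


sample = """\
# extension  filetype (hex)
.txt  fff
,ffb  ffb    # legacy suffix used by existing repositories
.bas  ffb
"""

print(load_extension_map(sample))
```

Keeping the mappings in a local file rather than in the repository matches the intent above: two users can check out the same history and still disagree about which filetype “.txt” should map to.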

API Reference

git filetype [--snapshot | --apply] [--all | <filespec>]

git config riscos.filetypes <filename>
