
Kylix for Delphi programmers - The Unix file model

Unix files aren't strongly tied to their names, the way Windows files are. Where a Windows directory entry contains the name and a pointer to a sequence of disk blocks, a Unix directory entry contains a name and a pointer to an inode. The inode (usually expanded as "index node") 'is' the file: it contains the file size, permissions, a pointer to a sequence of disk blocks, and one or two reference counts. On-disk inodes have only one reference count, which is the number of file names that the file has. (This count includes only hard links, not soft (or symbolic) links. A soft link is, essentially, a text file that contains the name of another file, which may itself be another soft link.) When a file is opened, its on-disk inode is read and converted into an in-memory inode, which is functionally identical except that it also maintains a count of the number of processes that have the file open.
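The name count and the hard/soft distinction are easy to observe from a program. What follows is a minimal sketch in Python rather than Object Pascal, purely for brevity; the file names and the link_counts helper are made up for illustration, and it assumes a filesystem that supports both kinds of link.

```python
import os
import tempfile


def link_counts():
    """Create a file, a hard link and a soft link, and report what stat sees."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")

        with open(original, "w") as f:
            f.write("hello\n")
        names_before = os.stat(original).st_nlink  # the inode's name count

        os.link(original, hard)      # a second name for the same inode
        os.symlink(original, soft)   # a separate little file that stores a path

        return {
            "names_before": names_before,
            "names_after": os.stat(original).st_nlink,
            # Both hard-link names lead to one and the same inode.
            "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
            # lstat examines the soft link itself instead of following it.
            "soft_has_own_inode": os.lstat(soft).st_ino != os.stat(original).st_ino,
            # The soft link's "contents" are just the name it points to.
            "soft_stores_name": os.readlink(soft) == original,
        }


result = link_counts()
```

The hard link raises the name count from 1 to 2; the soft link leaves it alone, because it is a different inode that merely contains a name.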

Thus, deleting a file doesn't necessarily delete the inode and the associated storage. Rather, it deletes the directory entry and decrements the inode's name count. The inode is actually deleted only if both the name count and the process count are zero. Otherwise, it sticks around. Similarly, when a file is closed, its inode's process count is decremented, and the inode is deleted if the name count and the process count are both now zero.
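A short experiment makes the two counts concrete. This is a hedged sketch in Python, assuming an ordinary local Linux filesystem (network filesystems can behave differently); the scratch file name and the read_after_unlink helper are invented for the example.

```python
import os
import tempfile


def read_after_unlink():
    """Delete a file's only name while it is open, then keep reading it."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "scratch.txt")
        with open(path, "w") as f:
            f.write("still here")

        handle = open(path)        # this process now holds the inode open
        os.unlink(path)            # removes the directory entry (the last name)

        name_gone = not os.path.exists(path)
        # Name count as seen through the open handle; normally 0 at this point
        # on a local Linux filesystem.
        names_left = os.fstat(handle.fileno()).st_nlink
        data = handle.read()       # the inode and its blocks are still there
        handle.close()             # only now can the storage be released
        return name_gone, names_left, data


name_gone, names_left, data = read_after_unlink()
```

The name disappears immediately, but the data stays readable until the last open handle is closed.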

When you open a file, the system looks up the filename, and gets the inode from that. What actually gets opened is the inode. If the same inode is known as both ThisFile and ThatFile, it's entirely possible for one process to change ThisFile and have those changes seen by another process (or even the same process) which has ThatFile open.
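As an illustration, the ThisFile/ThatFile scenario can be reproduced in a few lines. This is only a sketch in Python under the assumption that the writer modifies the file in place; the helper name is hypothetical.

```python
import os
import tempfile


def write_one_name_read_other():
    """Change a file through one name and observe it through the other."""
    with tempfile.TemporaryDirectory() as d:
        this_file = os.path.join(d, "ThisFile")
        that_file = os.path.join(d, "ThatFile")

        with open(this_file, "w") as f:
            f.write("first version")
        os.link(this_file, that_file)   # ThatFile is a second name, not a copy

        reader = open(that_file)        # keep ThatFile open throughout
        before = reader.read()

        # Opening with "w" truncates and rewrites the existing inode in place.
        with open(this_file, "w") as writer:
            writer.write("second version")

        reader.seek(0)                  # discard buffered data and reread
        after = reader.read()
        reader.close()
        return before, after


before, after = write_one_name_read_other()
```

Note the assumption: a program that saves by writing a new file and renaming it over the old name creates a new inode, and the other name then keeps pointing at the old contents.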

This is actually rather similar to the way Object Pascal's dynamic arrays work: after A := B, changing A[0] also changes B[0]. Like dynamic arrays, the links are one way, from the directory entry to the inode. The inode knows how many names it has, but there's no way to tell what they are, short of scanning (potentially) the whole directory tree. Similarly, the name a file is created under is not privileged in any way. That is, there's no concept of a "real name": every hard link has the same status as every other hard link.
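Scanning the tree is exactly what a program has to do if it wants the other names. The following Python sketch shows one way to do it; names_for_inode and demo are hypothetical helpers, and matching on the device and inode number pair is the assumption that makes two names "the same file".

```python
import os
import tempfile


def names_for_inode(path, root):
    """Walk root and return every name that refers to the same inode as path."""
    target = os.stat(path)
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            candidate = os.path.join(dirpath, name)
            try:
                st = os.lstat(candidate)   # lstat: do not follow soft links
            except OSError:
                continue                   # vanished or unreadable; skip it
            # Inode numbers are only unique within one device.
            if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
                found.append(candidate)
    return sorted(found)


def demo():
    """Build a tiny tree with two hard links and one soft link, then scan it."""
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "first.txt")
        os.mkdir(os.path.join(root, "sub"))
        second = os.path.join(root, "sub", "second.txt")
        soft = os.path.join(root, "soft.txt")

        with open(first, "w") as f:
            f.write("data")
        os.link(first, second)
        os.symlink(first, soft)

        names = names_for_inode(first, root)
        return [os.path.relpath(n, root) for n in names]


names = demo()
```

The scan reports both hard links and skips the soft link. Nothing in the result says which name came first, which is the point: neither one is the "real" name.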

Another way that the inode 'is' the file and the filenames are just pointers is that permissions belong to the inode, not to the filename. Thus, you can't have two different hard links to the same file with different permissions, though a soft link, being a separate file with its own inode, can have different permissions than the file it links to.
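A quick check of the permissions claim, sketched in Python with made-up file names: changing the mode through one hard link changes what every other hard link reports, because there is only one set of permission bits to change.

```python
import os
import stat
import tempfile


def chmod_through_one_name():
    """Change permissions via one hard link and read them back via another."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        open(a, "w").close()
        os.link(a, b)                 # b is a second name for a's inode

        os.chmod(a, 0o640)            # change the mode "of a"

        mode_a = stat.S_IMODE(os.stat(a).st_mode)
        mode_b = stat.S_IMODE(os.stat(b).st_mode)
        return mode_a, mode_b


mode_a, mode_b = chmod_through_one_name()
```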

No "application directory"

File links are a Unix feature that has no close Windows counterpart. When you create a link to a file, you're basically just adding another name for an existing inode. This is a rather wonderful feature, and of course you can manipulate both hard and soft links from your Kylix programs just as you can from a shell prompt, but links also create a tough problem for your deployed programs.

In a deployed environment, it's common to install an arbitrarily complex directory tree wherever the user wants it, and then to just install a link to the main executable in some directory in the PATH. This keeps the PATH directories as small and fast to read as possible, and hides the details of your program from casual inspection.

It also creates a new problem. Where is your application? That is, Windows programmers are used to looking at ExtractFilePath(ParamStr(0)) or Application.ExeName to see where their executable is. They then load database and/or configuration files in the executable path or in a subdirectory. This doesn't work under Linux! Nothing requires ParamStr(0) to contain a full path: its contents are whatever the launching program chose to pass, and apparently can vary even between versions of bash.

Moreover, even if ParamStr(0) (or System.GetModuleFileName) could reliably return path information, it would suffer from a more basic problem: what the user typed to start the program may simply be a link from a PATH directory to the application directory. Getting the path where the link lives wouldn't help at all with finding the application directory! (You can use readlink to read soft links, but there's no way to do the same for hard links. Remember, a file has no "real name": every hard link is just as real as every other hard link.)
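The difference between the two kinds of link can be sketched as follows. This is an illustrative Python example, not a recommended way to locate an application: application_dir is a hypothetical helper, the directory layout and the tsf-soft and tsf-hard names are invented, and os.path.realpath stands in for repeatedly calling readlink.

```python
import os
import tempfile


def application_dir(invoked_path):
    """Follow any soft links from the invoked path; return the final directory."""
    return os.path.dirname(os.path.realpath(invoked_path))


def demo():
    """Install a fake executable, then reach it via a soft and a hard link."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.realpath(tmp)   # the temp dir itself may sit behind a link
        app_dir = os.path.join(root, "opt", "turboschadenfreude")
        bin_dir = os.path.join(root, "bin")
        os.makedirs(app_dir)
        os.makedirs(bin_dir)

        exe = os.path.join(app_dir, "turboschadenfreude")
        open(exe, "w").close()

        soft = os.path.join(bin_dir, "tsf-soft")
        hard = os.path.join(bin_dir, "tsf-hard")
        os.symlink(exe, soft)
        os.link(exe, hard)

        # The soft link records where it points, so it can be followed.
        soft_finds_app = application_dir(soft) == app_dir
        # The hard link is just another name; it resolves to where it lives.
        hard_stays_in_bin = application_dir(hard) == bin_dir
        return soft_finds_app, hard_stays_in_bin


soft_finds_app, hard_stays_in_bin = demo()
```

Even the soft-link case only works when the program is handed a usable path to begin with, which is the first problem described above.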

This combination of imperfect information and hard links means that while it still makes sense to store data files in the application directory (or in a subdirectory of the application directory), you can't use the Windows technique of using the application's path to find your configuration and/or data files. (Linux won't automatically load any .so's or packages from the application directory, either. You have to explicitly add the application directory to LD_LIBRARY_PATH via a 'branded' script that knows where the application was installed.) The Linux approach is to store configuration files in a fixed or privileged location. Thus, an application called "turboschadenfreude" would normally be installed in /opt/turboschadenfreude. It would read global configuration data (like where the data files are) from /etc/opt/turboschadenfreude (yes, this only works if there aren't two or more different programs with the same name). It would read and write user configuration data (like window positions and a Most Recently Used files list) in the ~/.turboschadenfreude (or perhaps ~/.turboschadenfreude.ini) file or directory.
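The fixed-location convention boils down to deriving three paths from the application's name instead of from its executable's path. A minimal Python sketch, in which config_locations and its home parameter are hypothetical and exist only for illustration:

```python
import os


def config_locations(app_name, home=None):
    """Return the conventional install, global and per-user paths for app_name."""
    if home is None:
        home = os.path.expanduser("~")   # the current user's home directory
    return {
        "install_dir": "/opt/" + app_name,
        "global_config": "/etc/opt/" + app_name,
        # A dot file (or dot directory) in the user's home directory.
        "user_config": os.path.join(home, "." + app_name),
    }


locations = config_locations("turboschadenfreude", home="/home/user")
```

The program would then read the location of its data files from the global configuration, so nothing depends on how or from where it was started.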

Created on September 12, 2002, last updated March 24, 2006 • Contact jon@midnightbeach.com