
NetBSD-SoC: GNU/Hurd translators

What is it?

GNU/Hurd's translators.

Translators are programs that provide filesystem-in-userspace functionality and go a step further. Each inode can have a translator associated with it, which means that VFS operations on that inode are handled by the translator, a userland program, instead of by the kernel filesystem. The association may act just like a mount, in which case it disappears after a reboot, or it can be stored persistently. The former is called an active translator and the latter a passive translator.
The 3 main differences between translators and regular filesystems in userspace are:

The project

The objective of this project is to make NetBSD translator-aware. This means VFS modifications, updates to userland tools such as fsck to handle the resulting filesystem changes, new userland tools for managing translators, and GNU/Hurd binary compatibility so that GNU/Hurd's translators can run on NetBSD.

Status

Deliverables

Mandatory (must-have) components:

Optional (would-be-nice) components:

Documentation

The two main additions to the kernel are new VOPs and new syscalls for handling translators. They are very simple.

Syscalls

VOPs

They correspond to the syscalls in a natural way.

NetBSD specific translators' documentation (a formatted copy of file design.txt from the project)

Overview

Translators may be implemented using puffs. They are implemented just like any other puffs filesystem, with two restrictions. First, they must not rely on open file descriptors, the environment, or anything else that is inherited during exec. This is a natural consequence of the fact that, when a passive translator is activated, it is impossible to choose a proper process from which those things should be inherited. Second, the mountpoint is passed as the last argument to the program.
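To make the second restriction concrete, here is a minimal sketch of how a translator program might pick up its mountpoint. The helper name translator_mountpoint is hypothetical and not part of the project; a real translator would then hand the result to puffs_mount.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the mountpoint, which by convention is the
 * last command-line argument, or NULL if no argument was given.
 * Nothing is read from the environment or from inherited descriptors.
 */
static const char *
translator_mountpoint(int argc, char **argv)
{
    assert(argv != NULL);
    if (argc < 2)
        return NULL;            /* only the program name was passed */
    return argv[argc - 1];      /* last argument is the mountpoint */
}
```

A translator's main() would call translator_mountpoint(argc, argv) first and treat every argument before the last one as its own options.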

How it is done in the kernel:

The kernel structures had to be extended to allow synchronization of the translators; the synchronization is needed, for example, to delay the completion of lookup() until the translator settles. The extensions are: In struct vnode: In struct proc: Two additional flags were added to the vnode's flags.

General concepts

Whether a vnode is translated is determined by v_trans_list and the VV_TRANSBUSY flag. If the list is not empty, there are processes that either are a translator or may become one, that is, they hold the right to be the vnode's translator. A vnode may be in a state where the translator is just being started: the vnode is not yet translated, but there are processes holding the right to become a translator (see below). In that state the VV_TRANSBUSY flag is set, which means that the translator has not yet completed a successful puffs_mount. To check whether a vnode is translated, check that the list is not empty and that VV_TRANSBUSY is unset. While VV_TRANSBUSY is set, lookup() blocks on this vnode, and so do all translator-related functions. The flag is managed in exit1, puffs_mount and sys_unmount.
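The check described above can be sketched as a small state function. This is an illustration only: struct vnode_model, struct trans_proc, the v_vflag field name and the bit value chosen for VV_TRANSBUSY are assumptions made for the example and do not match the real kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed bit value, for illustration only. */
#define VV_TRANSBUSY 0x1u

/* Hypothetical stand-in for one process holding the translator right. */
struct trans_proc {
    struct trans_proc *next;
};

/* Hypothetical, reduced model of the fields discussed in the text. */
struct vnode_model {
    struct trans_proc *v_trans_list;   /* holders of the translator right */
    unsigned int       v_vflag;        /* vnode flags */
};

enum trans_state { TRANS_NONE, TRANS_STARTING, TRANS_ACTIVE };

static enum trans_state
trans_state(const struct vnode_model *vp)
{
    assert(vp != NULL);
    if (vp->v_trans_list == NULL)
        return TRANS_NONE;          /* nobody holds the right */
    if (vp->v_vflag & VV_TRANSBUSY)
        return TRANS_STARTING;      /* no successful puffs_mount yet */
    return TRANS_ACTIVE;            /* list non-empty and flag unset */
}

static bool
vnode_is_translated(const struct vnode_model *vp)
{
    return trans_state(vp) == TRANS_ACTIVE;
}
```

In this model, lookup() would block while trans_state() returns TRANS_STARTING and proceed in the other two states.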

When a translator is started, it is always spawned by the kernel as a child of init. Therefore, when settrans is called, none of the caller's attributes are inherited.

Translator concept in detail

It is not uncommon to want to use a script as a translator. Forks are then unavoidable, so they need to be taken care of. I decided to do the following: when a process is started as a translator, it is granted the right to become the translator of the vnode it was started for. The right is inherited across forks and execs. Processes may use the right multiple times, but only one process at a time. The right is nothing more than a reference to the vnode in p_trans. It might happen that the launched process and its children never call puffs_mount. Keep in mind that the vnode is locked until puffs_mount is called. When we discover that no one will be able to call it (because the original process and its children have died), we signal everyone who is waiting for the translator to settle that we didn't make it, simply by clearing VV_TRANSBUSY and signaling them. To achieve that, a list of translator processes is maintained (in fork and exit).
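The bookkeeping described in this section can be modelled in a few lines. The sketch below is a simplification under stated assumptions: the list of right holders is reduced to a counter, waking waiters is reduced to incrementing wakeups, and all function and struct names except p_trans are hypothetical. It does not model the "only one process at a time" rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of a vnode. */
struct vnode_model {
    int  holders;     /* number of processes on the translator list */
    bool transbusy;   /* models the VV_TRANSBUSY flag */
    int  wakeups;     /* how many times waiters were signalled */
};

struct proc_model {
    struct vnode_model *p_trans;   /* non-NULL: holds the translator right */
};

/* The kernel starts a translator for vp: grant the right, mark it busy. */
static void
trans_start(struct proc_model *p, struct vnode_model *vp)
{
    p->p_trans = vp;
    vp->holders++;
    vp->transbusy = true;
}

/* fork: the child inherits the right and joins the list. */
static void
trans_fork(const struct proc_model *parent, struct proc_model *child)
{
    child->p_trans = parent->p_trans;
    if (child->p_trans != NULL)
        child->p_trans->holders++;
}

/* A successful mount: the translator has settled, wake the waiters. */
static void
trans_mounted(struct proc_model *p)
{
    assert(p->p_trans != NULL);
    p->p_trans->transbusy = false;
    p->p_trans->wakeups++;
}

/* exit: leave the list; the last holder of a still-busy vnode gives up. */
static void
trans_exit(struct proc_model *p)
{
    struct vnode_model *vp = p->p_trans;

    if (vp == NULL)
        return;
    p->p_trans = NULL;
    vp->holders--;
    if (vp->holders == 0 && vp->transbusy) {
        vp->transbusy = false;   /* nobody can call puffs_mount any more */
        vp->wakeups++;           /* tell waiters that we didn't make it */
    }
}
```

For a script that forks once and then dies together with its child without mounting, the waiters are woken exactly once, when the last of the two processes exits.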

Locking

Vnodes are locked in a fashion which can't be used with cv_wait (AFAIK). Therefore, I've added the v_trans_mtx field to the vnode structure to make use of v_trans_cv. This condition variable is used for waiting on translators to settle (which basically means calling puffs_vfsop_mount). The mutex makes it possible to call VOP_UNLOCK and wait on the condition variable atomically enough to avoid races. The relevant functions are:
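As a rough userland analogy of the waiting pattern described in this section (not the kernel functions themselves), the sketch below uses POSIX threads in place of the kernel's mutex and condition variable primitives. The struct and function names are hypothetical; only v_trans_mtx and v_trans_cv are taken from the text.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, reduced model of the vnode fields used for waiting. */
struct vnode_model {
    pthread_mutex_t v_trans_mtx;
    pthread_cond_t  v_trans_cv;
    bool            transbusy;     /* models VV_TRANSBUSY */
};

static void
vnode_model_init(struct vnode_model *vp)
{
    int rc;

    rc = pthread_mutex_init(&vp->v_trans_mtx, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&vp->v_trans_cv, NULL);
    assert(rc == 0);
    (void)rc;
    vp->transbusy = true;          /* a translator is being started */
}

/* Block until the translator has settled (or given up). */
static void
trans_wait_settled(struct vnode_model *vp)
{
    pthread_mutex_lock(&vp->v_trans_mtx);
    /*
     * In the kernel, the vnode lock would be dropped at this point, with
     * the mutex already held, so that a wakeup cannot be missed between
     * unlocking the vnode and starting to wait.
     */
    while (vp->transbusy)
        pthread_cond_wait(&vp->v_trans_cv, &vp->v_trans_mtx);
    pthread_mutex_unlock(&vp->v_trans_mtx);
}

/* Called when the mount succeeds or the last candidate process exits. */
static void
trans_settled(struct vnode_model *vp)
{
    pthread_mutex_lock(&vp->v_trans_mtx);
    vp->transbusy = false;
    pthread_cond_broadcast(&vp->v_trans_cv);
    pthread_mutex_unlock(&vp->v_trans_mtx);
}
```

Because the flag is only changed while the mutex is held, a waiter either sees the flag already cleared or is guaranteed to receive the broadcast.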

When are passive translators activated?

In the lookup() function. The code is very straightforward: if there is a passive translator (determined by the VV_TRANS flag), there is no active translator, and no active translator is being set (VV_TRANSBUSY flag), start the passive translator. This part is not very elegant, because we may not have the absolute path to the mountpoint there, so we need to do the same thing getcwd does. getcwd's code had to be modified so that it can be called from lookup(): a lock had to be changed to a recursive one. There are two attempts to get the passive translator: the second is made if the translator command doesn't fit in one page (which is how much we assume on the first try). By then we know exactly how much space we need.
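The two-attempt retrieval of the translator command can be sketched as follows. This is a userland model, not the kernel code: fetch_cmd stands in for whatever call returns the stored command, FIRST_TRY_SIZE assumes a 4096-byte page, and every name here is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define FIRST_TRY_SIZE 4096   /* assumed page size for the first attempt */

/*
 * Hypothetical backend: copy the stored command into buf, or report via
 * *needed how much space it requires when buf is too small.
 */
static int
fetch_cmd(const char *stored, size_t stored_len,
    char *buf, size_t buflen, size_t *needed)
{
    *needed = stored_len;
    if (stored_len > buflen)
        return ERANGE;
    memcpy(buf, stored, stored_len);
    return 0;
}

/*
 * Return a malloc'd copy of the command (caller frees), or NULL if an
 * allocation fails. *attempts records how many tries were needed.
 */
static char *
get_passive_translator(const char *stored, size_t stored_len,
    size_t *lenp, int *attempts)
{
    size_t needed = 0;
    char *buf = malloc(FIRST_TRY_SIZE);

    if (buf == NULL)
        return NULL;
    *attempts = 1;
    if (fetch_cmd(stored, stored_len, buf, FIRST_TRY_SIZE, &needed) == ERANGE) {
        /* The first try told us the exact size; retry with that. */
        free(buf);
        buf = malloc(needed);
        if (buf == NULL)
            return NULL;
        *attempts = 2;
        int error = fetch_cmd(stored, stored_len, buf, needed, &needed);
        assert(error == 0);
        (void)error;
    }
    *lenp = needed;
    return buf;
}
```

The common case, a command shorter than one page, costs a single allocation; only oversized commands pay for the second attempt.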

Access to the lower level

Not yet implemented, but will be soon. The idea is very simple: as it doesn't make any sense for a puffs filesystem to do a lookup on itself, every lookup issued by a puffs filesystem would be treated as if that filesystem did not exist. To access the lower level, you simply open the file by pathname. It might be a good idea to introduce some flags for this purpose. It still requires some thinking.

Translator stacking

No special care has to be taken, as it is simply all about stacking mountpoints, which works.

Bugs

The userland tools have man pages.

Translators' interface

How is the translator stored in ext2

There is a special field in the ext2fs inode: e2di_translator. It holds a block number. The block it points to has the following format:

fsck_ext2fs checks for the following errors:

It doesn't check if the contents of the translator block are correct because e2fsprogs' e2fsck doesn't do it either and it cannot do anything bad to the rest of the filesystem.

What is known about GNU/Hurd's emulation

GNU/Hurd works on top of the GNUMach microkernel. Here I would like to summarize what I have learned about the current state of the implementation, what I have already done, what should be easily doable and what can be difficult. Unlike the original Mach, GNUMach uses ELF instead of Mach-O.
Marek Dopiera <siersciu@gmail.com>
$Id: index.html,v 1.11 2008/09/17 17:36:11 siersciu Exp $