FUSE is a library with a set of functions that let you reimplement the VFS layer operations; in other words, it lets you write a filesystem in userspace. Staying out of kernel space is very flexible, because we can use any external library.
I first became aware of Fuse when I read about GmailFs, which uses a Gmail account as structured storage and exposes it as a normal mountpoint in your system. GmailFs uses Fuse-Python, a set of bindings for using Fuse from Python. AVFS is another example of a Fuse filesystem: it treats tar files as if they were drives, so they can be mounted and accessed without having to untar them.
What I want to use Fuse for is to make the file indexer more autonomous. With the help of Fuse, one could have a Fuse mountpoint where all files are always indexed. When you perform a write operation, the indexer could reindex the changed portion; when you delete a file, it is automatically removed from the index; and so on. This way there is no need to run the indexer as a daemon that checks for changed files to reindex.
To accomplish this I need to implement an indexing hook on the following operations:
Write
Unlink
Rename
Truncate
At the end we would have a regular directory, but always synced with the indexing database.
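To make the four hooks concrete, here is a minimal sketch of what they might do to an index. It does not use Fuse at all: ToyIndex and its method names are hypothetical, the "index" is just an in-memory word table, and the file contents are passed in as plain strings. In a real Fuse filesystem each method would be called from the matching operation handler.

```python
class ToyIndex:
    """Hypothetical in-memory index: remembers the set of words in each path."""

    def __init__(self):
        self.words_by_path = {}

    def on_write(self, path, text):
        # Reindex the file from its new contents.
        self.words_by_path[path] = set(text.lower().split())

    def on_unlink(self, path):
        # A deleted file must disappear from the index.
        self.words_by_path.pop(path, None)

    def on_rename(self, old_path, new_path):
        # Same contents, new name: move the entry instead of reindexing.
        if old_path in self.words_by_path:
            self.words_by_path[new_path] = self.words_by_path.pop(old_path)

    def on_truncate(self, path, remaining_text):
        # After a truncate only the remaining text should stay searchable.
        self.on_write(path, remaining_text)

    def search(self, word):
        # Return every indexed path whose contents include the word.
        word = word.lower()
        return sorted(p for p, words in self.words_by_path.items() if word in words)
```

With these hooks in place, a search always reflects the last write, rename, truncate or unlink, without any daemon scanning for changes.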
Implementation:
The main goal is just to make a prototype of what a searchable filesystem would look like; for this reason I have chosen Python (and also because I’m in the process of learning it). The implementation will make use of an existing filesystem (reiser, ext, …). Fuse won’t have much to do: it just redirects all access from one directory to another.
Imagine we want to have an indexed directory called Documents. We would have a mirror in .Documents, where the files are stored using our normal filesystem. We would work normally under Documents; when doing a write operation, the hook would catch it and reindex the file, and after that the file is saved in the mirror directory. The mirror directory makes our implementation independent of whichever filesystem the user is most comfortable with. We also avoid any problem with data corruption, because all the data is stored in a reliable filesystem: we are adding a layer on top of our filesystem, not reinventing the wheel.
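The mirror idea can be sketched in plain Python without Fuse. Everything named here is an assumption made for illustration: MirrorStore, real_path and the on_change callback are hypothetical and are not taken from the actual code. Unlike the description above, this sketch stores the data first and notifies the indexer afterwards, so the indexer could read the file back from the mirror.

```python
import os


class MirrorStore:
    """Hypothetical helper: maps paths seen under Documents to files in .Documents."""

    def __init__(self, mirror_root, on_change):
        self.mirror_root = mirror_root
        self.on_change = on_change  # assumed callback: on_change(event, *paths)

    def real_path(self, path):
        # "/notes/todo.txt" under the mountpoint -> "<mirror_root>/notes/todo.txt"
        return os.path.join(self.mirror_root, path.lstrip("/"))

    def write(self, path, data, offset=0):
        real = self.real_path(path)
        os.makedirs(os.path.dirname(real), exist_ok=True)
        mode = "r+b" if os.path.exists(real) else "w+b"
        with open(real, mode) as f:
            f.seek(offset)
            f.write(data)
        self.on_change("write", path)
        return len(data)

    def unlink(self, path):
        os.unlink(self.real_path(path))
        self.on_change("unlink", path)

    def rename(self, old_path, new_path):
        new_real = self.real_path(new_path)
        os.makedirs(os.path.dirname(new_real), exist_ok=True)
        os.rename(self.real_path(old_path), new_real)
        self.on_change("rename", old_path, new_path)

    def truncate(self, path, length):
        with open(self.real_path(path), "r+b") as f:
            f.truncate(length)
        self.on_change("truncate", path)
```

The underlying filesystem does all the storage work; the only extra behaviour is the callback fired after each of the four operations.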
Code:
Code for the Indexing Filesystem
References
Fuse SF project
LWN Article on Fuse
Fuse Documentation
GmailFs
You may also be interested in this new feature for Reiser4: Plugins!
Namesys