[OpenAFS-devel] Re: openafs / opendfs collaboration
Tom Keiser <tkeiser@gmail.com>
Fri, 21 Jan 2005 16:56:16 -0500
Ivan,
On Fri, 21 Jan 2005 10:27:33 +0100, Ivan Popov <pin@medic.chalmers.se> wrote:
> Hi Tom!
>
> On Tue, Jan 18, 2005 at 04:46:14PM -0500, Tom Keiser wrote:
> > Secondly, I know this is a rather drastic proposal, but is it time to
> > consider splitting the cache manager out of individual filesystem clients?
>
> What do you call a filesystem client and a cache manager in this context?
>
I'm (roughly) thinking of clients such as OpenAFS and OpenDFS as being
built from several interacting components:
cache manager:
- responsible for storage of data and metadata
- responsible for cache replacement strategy
- API for interaction with implementation of VFS interface
- API for access by RPC endpoint for things like cache invalidation
credential manager:
- example would be the linux 2.6 keyring implementation
implementation of VFS interface:
- very os-specific stuff
- probably in-kernel unless something like LUFS takes off
RPC endpoint:
- listener for cache invalidations, etc.
RPC client library:
- client stub library
fs-specific syscall:
- PAG management, etc.
This is still an oversimplified view (where to put things like fsprobes?).
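Just to make the cache manager piece concrete, this is roughly the kind
of interface I have in mind. All of the names below are made up for
illustration; none of this is an existing OpenAFS or DFS API:

    #include <stddef.h>      /* size_t */
    #include <sys/types.h>   /* off_t  */

    struct cm_fid {                       /* opaque, fs-supplied file identifier */
        unsigned char data[32];
    };

    struct cm_ops {                       /* callbacks each filesystem supplies  */
        int (*fetch)(const struct cm_fid *fid, off_t off, size_t len, void *buf);
        int (*store)(const struct cm_fid *fid, off_t off, size_t len,
                     const void *buf);
    };

    struct cm_cache;                      /* opaque cache handle */

    /* lifecycle */
    struct cm_cache *cm_init(const struct cm_ops *ops, size_t cache_bytes);
    void             cm_shutdown(struct cm_cache *cache);

    /* called by the VFS glue on the read/write paths */
    int cm_read(struct cm_cache *cache, const struct cm_fid *fid,
                off_t off, size_t len, void *buf);
    int cm_write(struct cm_cache *cache, const struct cm_fid *fid,
                 off_t off, size_t len, const void *buf);

    /* called by the RPC endpoint when the server breaks a callback/token */
    int cm_invalidate(struct cm_cache *cache, const struct cm_fid *fid);

The point is that the fs-specific code only shows up behind cm_ops and
behind the fid, so the cache bookkeeping itself never needs to know
which distributed filesystem it is serving.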
> I am afraid that different people (including myself) may think about
> very different things.
>
> > If the interfaces are abstract enough, we should be able to have multiple
> > distributed fs's using the same cache manager API.
>
> Do you mean any besides AFS and DFS?
>
These two are the most obvious. It's less clear whether other
filesystems would actually benefit from a cache manager complex enough
to handle AFS and DFS; it comes down to whether more lightweight
filesystems gain anything from a cache manager that sacrifices some
performance for caching aggressiveness. However, there's nothing to
preclude a pluggable replacement algorithm, or tunables, to select
whatever tradeoff is desired.
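For instance, something along these lines (again, purely hypothetical
names) would let a lightweight client register a cheap LRU policy while
AFS/DFS register something more aggressive:

    /* Hypothetical sketch -- none of these names exist in OpenAFS or DFS. */
    struct cm_cache;                                /* opaque cache handle     */
    struct cm_entry;                                /* one cached chunk/vnode  */

    struct cm_policy {
        const char *name;
        void (*touch)(struct cm_entry *e);          /* called on each cache hit */
        struct cm_entry *(*pick_victim)(struct cm_cache *cache); /* need space  */
    };

    int cm_set_policy(struct cm_cache *cache, const struct cm_policy *policy);

    /* e.g. a lightweight client might do
     *     cm_set_policy(cache, &cm_policy_lru);
     * while an AFS/DFS client selects a costlier, more aggressive policy:
     *     cm_set_policy(cache, &cm_policy_arc);
     */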
> > help reduce the amount of in-kernel code for which each
> > project is responsible. Anyone else think this is feasible?
>
> Do you mean in-kernel cache management? Then probably no.
> Both filesystems and kernels are of great variety.
>
This is an argument best left for another day. Suffice it to say, I
don't think supporting M in-kernel filesystems on N OSes is a
sustainable model. The less we depend on the subtle nuances of each
kernel's API, the better our chances of survival.
> If you mean a more general "cache bookkeeping library", then possibly yes,
> but still you'll get differences depending on how FSs and OSs distribute
> functionality between kernel and user space in a filesystem client.
>
This is what I was proposing in my initial post. Distributed
filesystems can benefit from an in-memory cache, but a larger cache
that survives reboots is often more appealing. Unfortunately, using
OS-specific caching facilities is just going to increase autoconf
complexity and produce even more ifdef soup. Filesystems like AFS and
DFS are so complex that we must have a common client codebase across
platforms, so a cross-platform cache library that uses something like
the osi API for interaction with the rest of the kernel sounds more
feasible. I don't see the one-OS vision of many Linux supporters
becoming a reality for several more years, so instead I'm advocating
something that sacrifices performance for OS agnosticism (sounds a bit
like the ARLA philosophy...).
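To illustrate what I mean by "something like the osi API" (this is a
made-up sketch, not the existing osi layer), the cache library would be
written entirely against a small table of OS shims, and each platform
would supply one implementation of that table:

    #include <stddef.h>
    #include <sys/types.h>

    /* Hypothetical OS-abstraction table; one small per-platform
     * implementation, while all cache bookkeeping code stays identical
     * across kernels. */
    struct cm_osi_ops {
        void *(*mem_alloc)(size_t len);
        void  (*mem_free)(void *p, size_t len);
        int   (*chunk_read)(int chunk_id, off_t off, size_t len, void *buf);
        int   (*chunk_write)(int chunk_id, off_t off, size_t len,
                             const void *buf);
        void  (*lock)(void *mutex);
        void  (*unlock)(void *mutex);
    };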
> If you mean the upcall interface (a common kernel module for different
> filesystems), then probably no - it reflects both the corresponding filesystem
> semantics and the corresponding kernel architecture...
>
I agree that the upcall interface will probably never be common. The
only way we could ever get there is the emergence of a
high-performance, cross-platform userspace filesystem API. Then maybe
we wouldn't feel compelled to put everything but the kitchen sink in
kernel-space ;)
> Though, less demanding filesystems can be happy with "foreign" kernel
> modules - like podfuk-smb or davfs2 using the Coda module.
>
While I was not trying to advocate a userspace implementation, I don't
think such an option should be ignored. But I'm one of the last few
hold-outs who like the elegance of the microkernel architecture.
Crossing the kernelspace/userspace boundary can be optimized: if you
want speed and parallelism, it could be crossed using something like
asynchronous message queues rather than one blocking upcall at a time.
Granted, there's not much reason for hope right now, but it sure would
make everyone's lives easier if a good userspace filesystem driver API
existed on multiple platforms. Yes, it will always be slower than
running in-kernel, but the reduction in maintenance needed to keep up
with rapidly changing kernel APIs should free up more people's time to
work on a better cache manager. Not to mention, debugging and
profiling userspace code is so much easier.
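For the asynchronous-queue idea, I'm imagining a message format roughly
like the following (hypothetical, not an existing interface): the
in-kernel stub enqueues a request and keeps going, the userspace cache
manager drains the queue and posts completions, so many operations stay
in flight instead of one blocking upcall per VFS operation.

    #include <stdint.h>

    enum cm_msg_op { CM_OP_READ = 1, CM_OP_WRITE, CM_OP_INVALIDATE };

    struct cm_msg {
        uint64_t id;          /* matches a completion to its request  */
        uint32_t op;          /* one of enum cm_msg_op                */
        uint32_t error;       /* filled in on completion              */
        uint8_t  fid[32];     /* opaque, fs-specific file identifier  */
        uint64_t offset;
        uint64_t length;
        /* data payload follows for WRITE requests / READ completions */
    };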
Regards,
--
Tom