[OpenAFS-devel] GSoC 2025 Application: OpenAFS Linux Kernel Module - Multi-page Folio Support Implementation

Cheyenne Wills cwills@sinenomine.net
Fri, 14 Mar 2025 15:10:06 -0600


On Fri, 14 Mar 2025 11:13:25 +0530
Sushil Pandey <contact.sushilpandey@gmail.com> wrote:
> Dear OpenAFS team,
> I'm Sushil Pandey, a Computer Science engineering student with
> experience in Linux kernel development, particularly in memory
> management subsystems. I'm writing to express my interest in the
> "OpenAFS Linux kernel module: Multi-page folio support" GSoC project.
> 
> During my previous internship, I worked on optimizing page cache
> utilization for a custom storage driver, which gave me hands-on
> experience with the Linux kernel's memory subsystem and folio API
> transitions. I've also contributed patches to fix memory leaks in the
> NFS client code, which required understanding similar VFS interfaces
> that OpenAFS interacts with.


Hello, Sushil Pandey

Thanks for your interest in the GSoC for OpenAFS.  It sounds like your
prior experience would be a great fit for the project.

I'll try to answer some of your questions.

> 
> I've been analyzing the OpenAFS codebase, specifically the memory
> management in afs/LINUX/osi_vm.c and afs/LINUX/osi_file.c, and see
> several opportunities where multi-page folios could improve
> performance, particularly in the readpage/writepage implementations
> and during bulk data transfers. I believe the afs_linux_storeproc()
> and afs_linux_fillpages() functions are key candidates for optimizing
> with multi-page folios.

It's good that you are getting familiar with the OpenAFS code base and
have started identifying some of the areas that could benefit from the
use of multi-page folios.

> 
> I have some specific questions regarding implementation details:
> 
> 1. Has OpenAFS considered an incremental approach where multi-page
> folios are first implemented for read operations before tackling the
> more complex write paths with their consistency requirements?

The current work for supporting folios within OpenAFS has been limited
to maintaining backward compatibility with single-page operations and
implementing just enough folio support to satisfy the current Linux
kernel's requirements.  So far this has been a fairly straightforward
process, but it hasn't offered the chance to take full advantage of
folios.

The development methodology used within OpenAFS is very incremental at
a commit level (you can take a look at gerrit.openafs.org to see how
large changes are introduced in small chunks).

> 2. Are there specific benchmarks or workloads you're targeting for
> performance improvements with multi-page folios, such as large
> sequential reads or specific application patterns?

There hasn't been any real discussion of specific benchmarks.  Part of
the project would be identifying what types of improvements are
possible.  One common use case of OpenAFS that might benefit is
application deployment, where there are a lot of sequential reads as
applications are pulled into memory in order to run.

> 3. How should the implementation handle kernel version compatibility,
> particularly for kernels before 5.16 where folio support was still
> evolving? Should we maintain parallel code paths or use feature
> detection?
> 

OpenAFS has a stated backward kernel compatibility (currently going
back to Linux 2.6.18).  This is handled via autoconf tests to determine
available features within the Linux kernel and preprocessor
conditionals within the code itself.
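
As a rough illustration of that pattern (not actual OpenAFS code), a
configure-time test would define a feature macro and the C code would
branch on it.  Every name below, including
EXAMPLE_HAVE_MULTIPAGE_FOLIO, and the 16-page limit are hypothetical
and chosen only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * HYPOTHETICAL feature macro.  In a real build, a configure-time test
 * would define (or not define) a macro like this in a generated header.
 * It is left undefined here, so the fallback branch is compiled.
 */
/* #define EXAMPLE_HAVE_MULTIPAGE_FOLIO 1 */

#define EXAMPLE_PAGE_SIZE 4096u

/* Largest number of bytes one cache unit may span in this build. */
static size_t
example_max_unit_bytes(void)
{
#if defined(EXAMPLE_HAVE_MULTIPAGE_FOLIO)
    /* Feature detected: a unit may cover several pages (16 is arbitrary). */
    return 16 * EXAMPLE_PAGE_SIZE;
#else
    /* Feature absent: fall back to one page per unit. */
    return EXAMPLE_PAGE_SIZE;
#endif
}

/* Number of units needed to cover len bytes, rounding up. */
static size_t
example_units_for(size_t len)
{
    size_t unit = example_max_unit_bytes();

    assert(unit > 0);
    return (len + unit - 1) / unit;
}
```

Callers only see example_max_unit_bytes(), so the conditional stays in
one place rather than being repeated at every call site.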

For this particular feature (multi-page folio support), there would be
an autoconf configure option to enable/disable it at configure/build
time (if the kernel supports it). In some of the discussions for this
project, the idea was brought up to have a new set of core functions
that handle the multi-page folios instead of trying to intermix code
with the existing single-page folio routines (though we would want to
try to retain as much code reuse as possible).  This would allow the
development of this new feature to focus more on the direct
implementation and not worry too much about the backward-compatibility
aspects.
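
One possible shape for that split, shown as a plain userspace sketch
rather than kernel code: the multi-page path gets its own entry point,
and both paths share a common helper.  All function names here are
hypothetical and do not correspond to existing OpenAFS routines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_PAGE_SIZE 4096u

/*
 * Shared helper used by both paths: copy up to len bytes of src,
 * starting at offset, into dst.  Returns the number of bytes copied,
 * which is short (or zero) when the range runs past the end of src.
 */
static size_t
example_copy_range(char *dst, const char *src, size_t src_len,
                   size_t offset, size_t len)
{
    size_t avail;

    assert(dst != NULL && src != NULL);
    if (offset >= src_len)
        return 0;
    avail = src_len - offset;
    if (len > avail)
        len = avail;
    memcpy(dst, src + offset, len);
    return len;
}

/* Existing-style path: exactly one page per call. */
static size_t
example_read_single(char *dst, const char *src, size_t src_len,
                    size_t page_index)
{
    return example_copy_range(dst, src, src_len,
                              page_index * EXAMPLE_PAGE_SIZE,
                              EXAMPLE_PAGE_SIZE);
}

/* New, separate path: npages contiguous pages in one call. */
static size_t
example_read_multi(char *dst, const char *src, size_t src_len,
                   size_t first_page, size_t npages)
{
    return example_copy_range(dst, src, src_len,
                              first_page * EXAMPLE_PAGE_SIZE,
                              npages * EXAMPLE_PAGE_SIZE);
}
```

The single-page routine is left untouched, while the multi-page
routine can evolve independently and still reuse the helper.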

> I've already started drafting a 12-week timeline for this project,
> dividing the work into analysis, implementation, and testing phases
> with specific milestones for each component of the file system. I
> would greatly appreciate your feedback on this timeline and any
> guidance on critical areas that should receive priority attention.
> 

Sounds like you have a good start.  

There are two general areas that will need to be looked at.
 - How OpenAFS presents data at the VFS level.  This is probably the
   area that will benefit the most from multi-page folios.
 - How OpenAFS handles its own disk caching.  Here OpenAFS is reading
   and writing to an underlying filesystem, so folio usage may be more
   nuanced.

The code that handles the above is in the cache manager, which is
located in src/afs, with the Linux-specific functions and interfaces
located in src/afs/LINUX.

The one area in particular you need to pay attention to is that
OpenAFS cannot use any of the kernel's GPL-only interfaces (due to
restrictions with the current OpenAFS licensing).

In addition to the code base, I would recommend reviewing the
contributor guide, which can be viewed at the following link.
https://github.com/openafs/openafs/blob/master/doc/process/01-contributor-guide.md

More details on any development workflow, etc. for the project would be
handled in discussion with the mentor assigned to the project.

> Thank you for considering my interest in this project. I'm excited
> about the opportunity to contribute to OpenAFS's performance on
> modern Linux kernels.
> 
> -- 
> 
> *Warm regards,Sushil Pandey*

You are more than welcome.  Thank you again for your interest.

-- 
Cheyenne Wills
cwills@sinenomine.net