[OpenAFS-devel] RFC: move rx epoch/cid generation into the rx core

Benjamin Kaduk kaduk@MIT.EDU
Tue, 11 Feb 2014 16:46:31 -0500 (EST)


Hi all,

One of the items on http://wiki.openafs.org/RXGKToDo/ is to move the 
generation of the rx epoch and connection ID into the rx core, removing 
this burden from the rxkad code (which was previously the only reliable 
source of random numbers).  I picked up my old attempt to do this and 
fixed it up, resulting in the patchset at 
https://github.com/kaduk/openafs/commits/epoch (commit 31fb6c1).

I haven't pushed it to gerrit yet, because the in-kernel rand-fortuna 
requires a patch in upstream heimdal 
(https://github.com/heimdal/heimdal/pull/61), the best resolution of which 
is still under discussion.  I would like to get some review of the code 
itself, though, so I am sending mail here.

We have to pull rand-fortuna (and sha256) into the kernel for the few 
systems which don't have a useful osi_readRandom().  rand-fortuna.c 
uses locking via the HEIMDAL_MUTEX_* macros; since we don't have many 
consumers of those in the kernel right now (i.e., just rand-fortuna) I 
decided to go for a single global mutex shared by all consumers of the 
HEIMDAL_MUTEX type.  Upstream heimdal uses this type more broadly, so if 
we ever start importing more code that uses it, we'll need to revisit that 
decision.  (Or we could revisit it now; I don't have particularly strong 
feelings, and this is just what I ended up with over the course of my 
development.)  I'm also not very familiar with what the kernel startup 
sequence looks like (and how consistent it is across OSes), so I just 
stuck the mutex initialization in local code that is called before 
anything in rand-fortuna is reached.  Is there a better place for 
something at this level?
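
To make the single-global-mutex idea concrete, here is a minimal sketch, 
not the code in the patchset.  It is a userspace stand-in that uses 
pthreads in place of kernel locks; the HEIMDAL_MUTEX_* names are the ones 
rand-fortuna.c uses, but these definitions and everything prefixed with 
sketch_ are assumptions made up for illustration.

```c
#include <pthread.h>

/*
 * Hypothetical userspace stand-in (pthreads instead of kernel locks):
 * every HEIMDAL_MUTEX object maps onto one shared global lock.
 */
typedef int HEIMDAL_MUTEX;          /* per-object storage is never used */
#define HEIMDAL_MUTEX_INITIALIZER 0

static pthread_mutex_t sketch_global_lock = PTHREAD_MUTEX_INITIALIZER;
static int sketch_holders;          /* only modified while the lock is held */

/* init/destroy are no-ops: the one real lock is statically initialized. */
#define HEIMDAL_MUTEX_init(m)     do { (void)(m); } while (0)
#define HEIMDAL_MUTEX_destroy(m)  do { (void)(m); } while (0)

/* lock/unlock ignore their argument and always take the global lock. */
#define HEIMDAL_MUTEX_lock(m)                           \
    do {                                                \
        (void)(m);                                      \
        pthread_mutex_lock(&sketch_global_lock);        \
        sketch_holders++;                               \
    } while (0)
#define HEIMDAL_MUTEX_unlock(m)                         \
    do {                                                \
        (void)(m);                                      \
        sketch_holders--;                               \
        pthread_mutex_unlock(&sketch_global_lock);      \
    } while (0)
```

Because the lock is shared and not recursive, any code that held two 
different HEIMDAL_MUTEX objects at the same time would deadlock against 
itself.  That should not matter while rand-fortuna is the only consumer 
(assuming it never nests locks), but it is the concrete reason the 
decision would need revisiting if more heimdal code is imported.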

My initial prototype for AFSOP_SEED_ENTROPY didn't do any copyin(); it 
just passed a few words of entropy directly.  Since rand-fortuna wants 128 
bytes of input before it considers itself fully seeded, passing a buffer 
seems like the way to go (instead of calling the syscall 32+ times in a 
loop).  We may also want to look at how to ensure that AFSOP_SEED_ENTROPY 
happens before any consumers of RAND_bytes in the kernel, since 
rand-fortuna ~lies and claims it's always seeded, even though it starts 
off with zero entropy due to our neutering of its initial seeding sources. 
I traced which afs syscalls were made on my system, and 
AFSOP_SEED_ENTROPY was the second one; the first was the 
AFSOP_SET_THISCELL call easily visible in afsd.c.  Since that doesn't fire 
up rx, we are safe for now, but how future-proof are we?
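
One way to stop relying on syscall ordering would be an explicit gate 
that counts how much seed material has arrived.  The following is only a 
sketch of that idea, not something in the patchset: all of the sketch_ 
names are hypothetical, the buffer is assumed to have already been copied 
in from userspace, and the 128-byte target is just the figure mentioned 
above.

```c
#include <stddef.h>

#define SKETCH_SEED_TARGET 128      /* bytes wanted before "fully seeded" */

static size_t sketch_seeded_bytes;  /* saturates at SKETCH_SEED_TARGET */

/*
 * Hypothetical body of a seed-entropy handler: account for one buffer.
 * A real handler would also feed buf to the generator; this sketch only
 * tracks how many bytes have been supplied so far.
 */
static void sketch_seed_entropy(const unsigned char *buf, size_t len)
{
    (void)buf;
    if (len > SKETCH_SEED_TARGET - sketch_seeded_bytes)
        sketch_seeded_bytes = SKETCH_SEED_TARGET;
    else
        sketch_seeded_bytes += len;
}

/* Nonzero once enough seed material has arrived to trust the output. */
static int sketch_rand_ready(void)
{
    return sketch_seeded_bytes >= SKETCH_SEED_TARGET;
}
```

A consumer of random bytes (or the rx startup path) could check 
sketch_rand_ready() and fail or wait instead of silently drawing from an 
unseeded generator.  With a single 128-byte buffer one call is enough to 
open the gate, which is the same argument for passing a buffer rather 
than a word at a time.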


I think that the other three patches are comparatively uninteresting, but 
I could always be proven wrong. :)

Any thoughts?

-Ben

P.S. Testing the rand-fortuna code path requires changing the preprocessor 
conditional in crypto/hcrypto/kernel/rand.c:RAND_bytes() on the systems I 
expect people to be using; no other changes should be needed.