[OpenAFS] Number of files in an OpenAFS volume...

Matthew Lowy matthew.lowy@it.ox.ac.uk
Mon, 4 Jul 2016 15:57:11 +0000


Hi Jeffrey,

(directory, my error)

Ah, that makes more sense... if I understand correctly, it looks like each
slot is 32 characters long. The first 16 characters of the first slot are
occupied by other file metadata, so only 16 are available for file name
data, but in each subsequent slot all 32 characters are available for file
name data. On that basis, and counting the terminating NUL that the function
adds, names of 0 - 15 characters take one slot, 16 - 47 character names take
2 slots, 48 - 79 character names take 3 slots, 80 - 111 character names take
4 slots and so on. I presume that if we were working in a locale where
double-byte characters were in file names the packing wouldn't be as good.

Incidentally, isn't the function equivalent to:

{
    int i;
    i = strlen(name);
    return 1 + ((i + 16) >> 5);
}
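
One way to check that equivalence is to compare the two expressions over a
range of name lengths. This is a minimal sketch: both helpers take the length
directly rather than a string, and the names slots_original,
slots_simplified and check_forms_agree are hypothetical, not part of OpenAFS.

```c
#include <assert.h>
#include <stddef.h>

/* Slot count as written in afs_dir_NameBlobs, with len standing in for
 * strlen(name). */
int slots_original(size_t len)
{
    size_t i = len + 1;               /* include the terminating NUL */
    return 1 + (int)((i + 15) >> 5);
}

/* The simplified form: the +1 is folded into the constant. */
int slots_simplified(size_t len)
{
    return 1 + (int)((len + 16) >> 5);
}

/* Aborts via assert() if the two forms ever disagree for a length
 * between 0 and max_len inclusive. */
void check_forms_agree(size_t max_len)
{
    for (size_t len = 0; len <= max_len; len++)
        assert(slots_original(len) == slots_simplified(len));
}
```

Calling check_forms_agree(4096) returns without tripping the assertion, since
(len + 1) + 15 and len + 16 are the same value before the shift.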

Implementing the algorithm below against the real data indicates that we are
using 63,959 slots, which is even more painfully close to 64k and explains
why the mirror update sometimes fails with error messages relating to
running out of slots.
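
A tally of that kind can be reproduced along these lines. This is only a
sketch, not the script actually run against the mirror: name_slots,
total_slots, slots_remaining and ASSUMED_SLOT_LIMIT are hypothetical names,
and the limit is the round 64k figure from this thread rather than an exact
value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The round "64k" figure quoted in this thread. Treating it as the exact
 * number of usable slots is an assumption; the real value may be lower. */
#define ASSUMED_SLOT_LIMIT 65536L

/* Same arithmetic as afs_dir_NameBlobs, repeated so the sketch stands alone. */
int name_slots(const char *name)
{
    size_t i = strlen(name) + 1;      /* include the terminating NUL */
    return 1 + (int)((i + 15) >> 5);
}

/* Sum the slots needed by every name in a directory listing. */
long total_slots(const char *const *names, size_t count)
{
    long total = 0;
    assert(names != NULL || count == 0);
    for (size_t n = 0; n < count; n++)
        total += name_slots(names[n]);
    return total;
}

/* Slots left under the assumed limit; a negative result means the
 * listing would not fit. */
long slots_remaining(const char *const *names, size_t count)
{
    return ASSUMED_SLOT_LIMIT - total_slots(names, count);
}
```

With the figures in this thread, 63,959 used slots would leave 1,577 under
that assumed limit, which is room for roughly 525 more three-slot names.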

Thank you, that resolves the anomaly... and as a team we'd already
discovered there are things about OpenAFS that limit what it can be applied
to... there have been some restrictions and tweaks needed in the mirroring
application. I'll point my manager at Auristor for the next tech refresh.

Matthew
________________________________________
From: Jeffrey Altman [jaltman@auristor.com]
Sent: 04 July 2016 15:27
To: Matthew Lowy; openafs-info@openafs.org
Subject: Re: [OpenAFS] Number of files in an OpenAFS volume...

On 7/4/2016 6:55 AM, Matthew Lowy wrote:
> Hello,
>
> We have a number of OpenAFS volumes that serve as storage for (public)
> mirrors and one of them is misbehaving when updated from upstream - the
> error indicates we've reached the limit of file names allowed in a volume.

In a volume or a directory?

The theoretical limit of directories in a volume is 2^30 and
non-directories in a volume is 2^30.  There have been incomplete efforts
to raise those limits by treating signed values as unsigned values but I
wouldn't count on them.

> The limit I am seeing is not compatible with my understanding of how
> OpenAFS handles file names in a directory. I've seen in the mail list
> archives the statements about how many file names can fit, that there
> are 64k slots and a file name < 16 in length occupies one slot, a file
> name from 16 to 32 characters long occupies two slots and so on. The
> earliest reference I've found is at
> http://lists.openafs.org/pipermail/openafs-info/2002-September/005812.html

Your understanding is roughly correct except that the numbers Todd
specified are wrong.   The actual numbers are determined by this function:

/* Find out how many entries are required to store a name. */
int
afs_dir_NameBlobs(char *name)
{
    int i;
    i = strlen(name) + 1;
    return 1 + ((i + 15) >> 5);
}

A slot contains per-file metadata followed by name data.  When a name needs
multiple slots, all of the space in the 2nd and subsequent slots is used for
file name data.
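
Read in the other direction, the same arithmetic gives the longest name that
fits in a given number of slots. The sketch below is an illustration derived
from the function above, not OpenAFS code; slots_for_length and
max_len_for_slots are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Slots needed for a name of the given length, following the arithmetic
 * in afs_dir_NameBlobs (the +1 accounts for the terminating NUL). */
int slots_for_length(size_t len)
{
    return 1 + (int)((len + 1 + 15) >> 5);
}

/* Longest name length, excluding the NUL, that still fits in `slots`
 * slots: 16 bytes in the first slot plus 32 in each later one, minus
 * one byte for the NUL. */
size_t max_len_for_slots(int slots)
{
    assert(slots >= 1);
    return (size_t)slots * 32 - 17;
}
```

This gives 15 characters for one slot, 47 for two and 79 for three, so names
of 52 to 70 characters all land in the three-slot range.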

> However...
>
> The directory concerned has more than 21,000 files in it, almost all of
> them have names exceeding 52 characters... as at today there are
> 1,220,000 characters in filenames in that directory. Even assuming they
> pack down perfectly into directory name slots that's over 76,000
> slots... and working them out using the rule above indicates that the
> directory is using over 87,000 slots. These are both significantly above
> 64k.
>
> I don't know if I'm misinterpreting the information in the OpenAFS
> archive or if the information is out of date - but I've not found
> anything that fundamentally is different from the information in the
> archive and I'm looking at a volume that seems to break the limits.

The AFS3 directory format is part of the wire protocol as it is shared
by both the file server and the clients.

> I'd really benefit from understanding what's going on ... how we appear to
> be getting more file name information into a directory than should be
> possible.
>
> /mirror.ox.ac.uk/sites/archive.ubuntu.com/ubuntu/pool/main/l/linux$ ls |wc -l
> 21731

This number is within the existing limits.

> /mirror.ox.ac.uk/sites/archive.ubuntu.com/ubuntu/pool/main/l/linux$ ls |wc -c
> 1250894

File names of length 52 through 70 characters require three slots.  If
all of the file names are of length 60 and are perfectly packed they
would require 62545 slots which is very close to the limit.

> This is one directory in a mirror of archive.ubuntu.com so you can see
> the contents from (e.g)
> https://launchpad.net/ubuntu/+mirror/mirror.ox.ac.uk-archive which
> points to the presentation of our mirror. The number of files has
> recently gone up because of upstream changes.

The directory size restrictions are one of the reasons that /afs cannot
be used for a large number of applications.  The AuriStor File System
implements a new directory format which is understood only by AuriStor
clients.  This format permits directories to grow to store an unlimited
number of entries.  However, the AuriStor file servers currently apply
an artificial limit of approximately 20 million entries.

More details on the AuriStor File System can be obtained at

  https://www.auristor.com/openafs/migrate-to-auristor/

Jeffrey Altman
=0A=