[OpenAFS-devel] dumpstuff.c: ProcessIndex observations
Peter Somogyi
psomogyi@gamax.hu
Wed, 23 Nov 2005 17:57:50 +0100
Hi,
I'm currently analyzing the ProcessIndex function in volser/dumpstuff.c.
I've come to some conclusions:
1. It can handle at most 2^32 / 256 = 2^24 large vnodes + 2^32 / 32 = 2^27 small vnodes per volume, since a variable ("size") inside it is a plain int.
2. This function wastes memory: 4 bytes per vnode are allocated for the duration of a vos move, where 1 bit would be enough (see also 3.), and the extra space buys no gain in speed:
...
Buf = (afs_int32 *) malloc(nVnodes * sizeof(afs_int32));
...
3. The code below seems to have been meant to do something other than what it actually implements:
...
    Buf = (afs_int32 *) malloc(nVnodes * sizeof(afs_int32));
    if (Buf == NULL)
        return 1;
    memset((char *)Buf, 0, nVnodes * sizeof(afs_int32));
    STREAM_SEEK(afile, offset = vcp->diskSize, 0);
    while (1) {
        code = STREAM_READ(vnode, vcp->diskSize, 1, afile);
        if (code != 1) {
            break;
        }
        if (vnode->type != vNull && VNDISK_GET_INO(vnode)) {
            Buf[(offset >> vcp->logSize) - 1] = offset;
            cnt++;
        }
        offset += vcp->diskSize;
    }
    *Bufp = Buf;
    *sizep = nVnodes;
...
- the variable "cnt" is never used
- note: offset is always a multiple of vcp->diskSize, so the assignment above wastes memory: the array index already determines the stored value (whenever it is not zero)
- "Buf" is used only to record whether we should keep a given vnode or not
4. It won't delete "old vnodes" (= not existing on the RW side) in the following cases, but will try to proceed normally:
- can't allocate the memory above (it won't report any error!)
- fails to open the vnode index file (fatal error, but it would continue normally)
- gets a new volume to process in the same dump (I think this feature is totally experimental, some variables are not prepared for the change - e.g.: iodp->device, iodp->parentId)
So I think that instead of allocating 4 bytes per vnode, 1 bit per vnode would be enough.
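To make the 1-bit idea concrete, here is a minimal sketch in C. It is not OpenAFS code: the type vnode_bitmap and the helpers bitmap_bytes, bitmap_init, bitmap_set, bitmap_test and slot_offset are hypothetical names made up for illustration. It also assumes that diskSize == 1 << logSize, so that the offset formerly stored in Buf can be recomputed from the slot index alone.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical replacement for the afs_int32 array: one bit per vnode slot. */
typedef struct {
    unsigned char *bits;   /* bit i set => keep vnode slot i */
    size_t nVnodes;        /* number of slots covered by the bitmap */
} vnode_bitmap;

/* Bytes needed for nVnodes bits, rounded up to a whole byte. */
static size_t bitmap_bytes(size_t nVnodes)
{
    return (nVnodes + CHAR_BIT - 1) / CHAR_BIT;
}

/* Allocate a zeroed bitmap; returns 0 on success, -1 if allocation fails. */
static int bitmap_init(vnode_bitmap *bm, size_t nVnodes)
{
    size_t n = bitmap_bytes(nVnodes);

    bm->bits = calloc(n ? n : 1, 1);
    bm->nVnodes = nVnodes;
    return bm->bits == NULL ? -1 : 0;
}

/* Mark slot idx as "keep"; plays the role of Buf[idx] = offset. */
static void bitmap_set(vnode_bitmap *bm, size_t idx)
{
    assert(idx < bm->nVnodes);
    bm->bits[idx / CHAR_BIT] |= (unsigned char)(1u << (idx % CHAR_BIT));
}

/* Returns 1 if slot idx is marked, 0 otherwise; plays the role of Buf[idx] != 0. */
static int bitmap_test(const vnode_bitmap *bm, size_t idx)
{
    assert(idx < bm->nVnodes);
    return (bm->bits[idx / CHAR_BIT] >> (idx % CHAR_BIT)) & 1;
}

/*
 * Recompute the index-file offset from the slot index, i.e. the inverse of
 * the original expression (offset >> logSize) - 1.
 * Assumption: diskSize == 1 << logSize.
 */
static uint32_t slot_offset(size_t idx, unsigned logSize)
{
    return (uint32_t)((idx + 1) << logSize);
}
```

With this layout, 100 vnode slots need 13 bytes instead of 400, and the offset of slot 9 with logSize 5 (32-byte entries) is recomputed as 320 rather than stored. A caller would have to treat a failed bitmap_init as an error instead of continuing silently.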
Are my statements correct?
My questions are:
- how many vnodes (= files + directories + deleted files(?) + deleted dirs(?)) can/may we have in a volume in a production environment in practice?
- are the above limitations harmful?
Peter