[OpenAFS-devel] Ideas on vos dump slowness

Harald Barth haba@pdc.kth.se
Tue, 17 Jul 2001 12:15:31 +0200


I experimented with some profiling. The following data was collected from
the volserver during a vos dump of a 1 GB volume (to /dev/null on localhost).
Profiling was done on alpha_dux40 with hiprof+gprof and pixie+prof.

I think most of the time is spent in rxi_Start() which is called when
new packets are written (rx_rdwr.c:rxi_Write()) and when packets are
received (rx.c:rxi_ReceiveAckPacket()).

Total profile of all functions (have a look at the file in AFS):

bleak# gprof volserver volserver.hiout > \
/afs/pdc.kth.se/home/h/haba/Public/volserver.104a.gprof.out

More detailed profile of rxi_Start():

bleak# prof -pixie -asm -numbers -Only rxi_Start \
       /scratch/openafs-1.0.4a/obj/volser/volserver > \
       /afs/pdc.kth.se/home/h/haba/Public/volserver.104a.pixie.out

The output of prof points to
rx.c:rxi_Start(), lines 4837 to 4853:

(gdb) list *0x120045514
0x120045514 is in rxi_Start (rx.c:4838).
4833            xmitList = (struct rx_packet **)
4834                       osi_Alloc(maxXmitPackets * sizeof(struct rx_packet *));
4835            if (xmitList == NULL)
4836                osi_Panic("rxi_Start, failed to allocate xmit list");
4837            for (queue_Scan(&call->tq, p, nxp, rx_packet)) {
4838              if (call->flags & RX_CALL_FAST_RECOVER_WAIT) {
4839                /* We shouldn't be sending packets if a thread is waiting
4840                 * to initiate congestion recovery */
4841                break;
4842              }

I do not know enough about rx to do something about it, but there seem to be
some hints in the code that rxi_Start is best avoided:

rx_rdwr.c:rxi_Write():

                if (!(call->flags & (RX_CALL_FAST_RECOVER|
                                     RX_CALL_FAST_RECOVER_WAIT))) {
                    rxi_Start(0, call, 0);

During a dump these flags are not set, so rxi_Start is called.
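
The guard in rxi_Write() can be read as a plain bit-mask test. The following
is only a minimal sketch: the flag values and the helper name demo_may_start()
are made up for illustration, the real definitions live in the rx headers and
may well differ.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only; the real
 * RX_CALL_FAST_RECOVER* definitions are in the rx headers. */
#define DEMO_FAST_RECOVER       0x1u
#define DEMO_FAST_RECOVER_WAIT  0x2u

/* Returns 1 when neither recovery flag is set, i.e. when a guard of
 * the same shape as the one in rxi_Write() would let the call to
 * rxi_Start() go ahead; returns 0 otherwise. */
int demo_may_start(unsigned int flags)
{
    return !(flags & (DEMO_FAST_RECOVER | DEMO_FAST_RECOVER_WAIT));
}
```

With no flags set (the situation seen during the dump) demo_may_start(0)
returns 1, so every write ends up starting a transmit pass.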

It looks to me like rxi_Start is called _a_lot_ and each call then runs
approx. 30 iterations of
	for (queue_Scan(&call->tq, p, nxp, rx_packet)) { ... }
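
To put a rough shape on that suspicion: if every call walks the whole
transmit queue, the total work grows as (number of calls) x (queue length).
The sketch below is a toy model and not rx code; struct demo_packet,
demo_start() and demo_total_visits() are hypothetical names, and the queue
is a plain singly linked list rather than the rx queue macros.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_QUEUE 64

/* Toy stand-in for a packet on the call's transmit queue. */
struct demo_packet {
    struct demo_packet *next;
};

/* Model of one rxi_Start()-like pass: walk the whole queue and
 * return how many packets were visited. */
unsigned long demo_start(const struct demo_packet *head)
{
    unsigned long visited = 0;
    const struct demo_packet *p;

    for (p = head; p != NULL; p = p->next)
        visited++;
    return visited;
}

/* Total packets visited when a queue of qlen packets is scanned
 * once per call, for ncalls calls. */
unsigned long demo_total_visits(unsigned long ncalls, size_t qlen)
{
    struct demo_packet q[DEMO_MAX_QUEUE];
    unsigned long total = 0;
    unsigned long c;
    size_t i;

    assert(qlen > 0 && qlen <= DEMO_MAX_QUEUE);

    /* Link the array elements into a list of length qlen. */
    for (i = 0; i < qlen; i++)
        q[i].next = (i + 1 < qlen) ? &q[i + 1] : NULL;

    for (c = 0; c < ncalls; c++)
        total += demo_start(&q[0]);
    return total;
}
```

In this model demo_total_visits(1000, 30) visits 30000 packets, so either
calling the scan less often or scanning fewer packets per call would reduce
the work proportionally.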

Any ideas for a cure, or am I barking up the wrong tree?

Harald.