[OpenAFS-devel] Ideas on vos dump slowness
Harald Barth
haba@pdc.kth.se
Tue, 17 Jul 2001 12:15:31 +0200
I experimented with some profiling. The following data was collected from
the volserver during a vos dump of a 1 GB volume (to /dev/null on localhost).
Profiling was done on alpha_dux40 with hiprof+gprof and pixie+prof.
I think most of the time is spent in rxi_Start() which is called when
new packets are written (rx_rdwr.c:rxi_Write()) and when packets are
received (rx.c:rxi_ReceiveAckPacket()).
Total profile of all functions (have a look at the file in AFS):
bleak# gprof volserver volserver.hiout > \
/afs/pdc.kth.se/home/h/haba/Public/volserver.104a.gprof.out
More detailed profile of rxi_Start():
bleak# prof -pixie -asm -numbers -Only rxi_Start \
/scratch/openafs-1.0.4a/obj/volser/volserver > \
/afs/pdc.kth.se/home/h/haba/Public/volserver.104a.pixie.out
The output of prof points to
rx.c:rxi_Start(), lines 4837 to 4853:
(gdb) list *0x120045514
0x120045514 is in rxi_Start (rx.c:4838).
4833 xmitList = (struct rx_packet **)
4834 osi_Alloc(maxXmitPackets * sizeof(struct rx_packet *));
4835 if (xmitList == NULL)
4836 osi_Panic("rxi_Start, failed to allocate xmit list");
4837 for (queue_Scan(&call->tq, p, nxp, rx_packet)) {
4838 if (call->flags & RX_CALL_FAST_RECOVER_WAIT) {
4839 /* We shouldn't be sending packets if a thread is waiting
4840 * to initiate congestion recovery */
4841 break;
4842 }
I don't know enough about rx to do something about it, but there seem to be
some hints in the code that calling rxi_Start is better avoided:
rx_rdwr.c:rxi_Write():
if (!(call->flags & (RX_CALL_FAST_RECOVER|
RX_CALL_FAST_RECOVER_WAIT))) {
rxi_Start(0, call, 0);
During the dump these flags are not set, so rxi_Start is called.
It looks to me like rxi_Start is called _a_lot_, and each call then goes
around approx. 30 times in
for (queue_Scan(&call->tq, p, nxp, rx_packet)) { ... }
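To make the suspected cost pattern concrete, here is a toy model in C. It is
not rx code: toy_packet, toy_scan and toy_total_visits are made-up names, and
the model assumes that every event (a write or a received ack) triggers one
full walk over a queue of fixed length. It only counts how many queue entries
get visited in total.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in rx. */
struct toy_packet {
    struct toy_packet *next;
};

/* Walk the whole list once, as a stand-in for one full scan of a
 * transmit queue; returns the number of entries visited. */
static unsigned long toy_scan(const struct toy_packet *head)
{
    unsigned long visited = 0;
    const struct toy_packet *p;

    for (p = head; p != NULL; p = p->next)
        visited++;
    return visited;
}

/* Total entries visited when the queue holds queue_len packets and is
 * rescanned from the head once per event. */
unsigned long toy_total_visits(unsigned long events, size_t queue_len)
{
    struct toy_packet pkts[64];
    unsigned long total = 0;
    unsigned long e;
    size_t i;

    assert(queue_len <= 64);

    /* Link the first queue_len array slots into a singly linked list. */
    for (i = 0; i < queue_len; i++)
        pkts[i].next = (i + 1 < queue_len) ? &pkts[i + 1] : NULL;

    /* One full scan per event. */
    for (e = 0; e < events; e++)
        total += toy_scan(queue_len ? pkts : NULL);
    return total;
}
```

In this model toy_total_visits(1000, 30) returns 30000: the work grows as
(number of events) x (queue length), so a scan of about 30 entries on every
write and every ack adds up quickly over a long transfer.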
Any ideas for a cure, or am I barking up the wrong tree?
Harald.