[OpenAFS] Re: 1.3.70 comments?
Steve Roseman
sgr0@Lehigh.EDU
Tue, 24 Aug 2004 15:10:47 -0400
Derrick J Brashear wrote:
> On Fri, 20 Aug 2004, Derrick J Brashear wrote:
>
>> Only idea I can come up with is that I somehow screwed up rx_Init; If
>> so, checking out the cvs head by date (2 weeks ago) should work. I
>> reproduced the problem, but sadly my AIX machine is 20 miles away and
>> didn't bother to come back up.
>
>
> Nope. Horst did testing and found it's this change:
> diff -u -r1.8 -r1.10
> --- rxkad.p.h 15 Jul 2003 23:16:42 -0000 1.8
> +++ rxkad.p.h 3 Apr 2004 07:59:53 -0000 1.10
> @@ -16,7 +16,7 @@
> /* no ticket good for longer than 30 days */
> #define MAXKTCTICKETLIFETIME (30*24*3600)
> #define MINKTCTICKETLEN 32
> -#define MAXKTCTICKETLEN 344
> +#define MAXKTCTICKETLEN 12000 /* was 344 */
> #define MAXKTCNAMELEN 64 /* name & inst should be 256 */
> #define MAXKTCREALMLEN 64 /* should be 256 */
> #define KTC_TIME_UNCERTAINTY (15*60) /* max skew bet. machines' clocks */
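For a sense of scale, here is a minimal sketch of what the constants in the quoted diff work out to. The OLD_/NEW_ macro names and the helper functions are hypothetical, made up for illustration; only the numeric values come from the diff, and nothing here is taken from the rest of the OpenAFS sources.

```c
#include <stddef.h>

/* Values copied from the quoted rxkad.p.h diff. */
#define MAXKTCTICKETLIFETIME (30*24*3600)
#define KTC_TIME_UNCERTAINTY (15*60)

/* Hypothetical names for the two revisions of MAXKTCTICKETLEN. */
#define OLD_MAXKTCTICKETLEN 344    /* value in r1.8 */
#define NEW_MAXKTCTICKETLEN 12000  /* value in r1.10 */

/* Longest ticket lifetime, in seconds (30 days). */
long max_ticket_lifetime_seconds(void) { return MAXKTCTICKETLIFETIME; }

/* Allowed clock skew between machines, in seconds (15 minutes). */
long time_uncertainty_seconds(void) { return KTC_TIME_UNCERTAINTY; }

/* Extra bytes taken by any buffer declared as char buf[MAXKTCTICKETLEN]
 * once the constant goes from 344 to 12000. */
size_t ticket_buffer_growth(void) {
    return sizeof(char[NEW_MAXKTCTICKETLEN]) - sizeof(char[OLD_MAXKTCTICKETLEN]);
}
```

So every buffer sized by MAXKTCTICKETLEN grows by roughly 11 KB with this change, wherever it happens to be allocated.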
(AIX 5.1, openafs 1.3.70 + above + rx.c changes from 8/18)
Well, that's closer. login and klog work, as does (nominally, at
least) AFS file access.
But after I go into AFS space and play around (I think it has to be
non-cached access) and then return to NFS space, the process hangs,
and I think any process that touches NFS from that point on hangs
too. The system is still running, topas shows nothing exciting, and
as long as existing jobs avoid NFS, I'm OK. Touch NFS, and it hangs.
Here's the kdb "proc" output (meaningless to me) for a hung "ls"
command.
Steve
SLOT NAME STATE PID PPID PGRP UID ADSPACE CL #THS
pvproc+008400 66 ls ACTIVE 042E2 034C8 042E2 00102 00001508 65 0001
NAME....... ls
STATE...... stat :07 .... xstat :0000
FLAGS...... flag :00200001 LOAD EXECED
........... flag2 :00000000
........... atomic :00000000
LINKS...... child :00000000
........... siblings :00000000
........... uidinfo :309CB640
........... ganchor :E2008400 <pvproc+008400>
THREAD..... threadlist :EA004000 <pvthread+004000>
DISPATCH... synch :FFFFFFFF
WLM........ class/wlm :41/0000
IDENTIFIER. uid :00000102 ........... suid :00000102
........... pid :000042E2 ........... ppid :000034C8
........... sid :000034C8 ........... pgrp :000042E2
MISC....... lock @ E20084EC 00000000
........... lock_d @ E2008554 00000000
...... parent_lock @ E2008550 00000000
...... session_lock @ E200854C 00000000
........... pgrpl :00000000
........... pgrpb :00000000
........... ttyl :00000000
........... ipc :00000000 ........... dblist :00000000
........... dbnext :00000000
STATISTICS. nframes :0000000000000013 ... npsblks :0000000000000000
........... nvpages :0000000000000013 ... auditmask :00000000
........... sched_next :00000000
........... sched_back :00000000
........... mempools @ E2008560 8000000000000000
......... usched_lock @ E2008508 00000000
........... uschedp :00000000
........... asyncio :00000000
CHECKPOINT. crid :00000000 ........... crid_token :FFFFFFFF
........... cridnext :00000000 ........... chksynch :FFFFFFFF
........... vpid :00000000 ........... vppid :00000000
........... vsid :00000000 ........... vpgrp :E2008600
PROCFS..... procfsvn :00000000
NUMA....... rset :00000000
PROC....... procp :05785E00 ........... size :00000148
FLAGS...... flag :00000000
........... flag2 :00000000
........... int :00000000
........... atomic:00000000
THREAD..... threadcount:00000001 ........... active :00000001
........... suspended :00000000 ........... terminating:00000000
........... local :00000000
SCHEDULE... nice : 60 ........... sched_pri : 255
DISPATCH... pevent :00000000
IDENTIFIER. pid :000042E2
MISC....... adspace :00001508
........... lgpage :00000000
SIGNAL..... infoq :00000000
........... pending :[3] 0000000000000000
........................[2] 0000000000000000
........................[1] 0000000000000000
........................[0] 0000000000000000
........... sigignore :[3] 0000000000000000
........................[2] 0000000000000000
........................[1] 0000000000000000
........................[0] 0000000008408800 SYS URG IO WINCH
........... sigcatch :[3] 0000000000000000
........................[2] 0000000000000000
........................[1] 0000000000000000
........................[0] 0000000000000000
........... siginfo :[3] 0000000000000000
........................[2] 0000000000000000
........................[1] 0000000000000000
........................[0] 0000000000000000
STATISTICS. page size :000000000000008B ... minflt :0000000000000051
........... majflt :0000000000000000 ... pctcpu :00000000
SCHEDULER.. repage :0000000000000000 ... sched_count:00000000
........... cpticks :0000.... ........... msgcnt :0000
........... majfltsec :00000000
........... rs_attinfo :00000000
CHECKPOINT. chkblock :00000000 ........... chkfile :00000000
PROCFS..... prtrcset :00000000
PVPROC..... pvprocp :E2008400 ............size :00000200
(0)>
(0)>
-----------------------------------------------------------------------------
Stephen G. Roseman
Lehigh University
sgr0@lehigh.edu