[OpenAFS] Solaris 10 11/06 afs 1.4.2 pam module panic.
Kris Kasner
tkasner@qualcomm.com
Mon, 18 Dec 2006 15:52:02 -0800 (PST)
Thanks for the reply!
Today at 18:18, Marcus Watts <mdw@umich.edu> wrote:
> What happens when you klog? Or aklog? Does it crash then?
> Does it crash immediately upon doing klog/aklog, or upon
> the first file reference after that?
As yet I've been unable to get the system to crash with the pam entries
removed.
I removed the pam entries, put a local password for my account in /etc/shadow,
and logged in. After klog I can get to my homedir and everything else. Doing lots
of reads/writes from my afs home dir has no effect on the system.
I should have also mentioned that I'm running 1.4.2 on many solaris 10 (SPARC and
x86) systems without issues; this is the first 11/06 release I've tried which
uses the pam module.
> Does it die if you use a cache manager from openafs 1.4.1 ? 1.2.13 ?
I can try 1.4.1 (I think I still have it lying around.. :) but I didn't think
the 1.2.x versions played well with solaris 10 at all.. 1.4.2 has some major
user level panic type bug fixes in it..
I added this line to /etc/pam.conf (we use this on all of our systems that auth
against afs, works just fine..)
sudo auth required /usr/afsws/lib/pam_afs.so.1 ignore_root
and typed "sudo true".
When I hit enter after typing my password, I was greeted with another panic:
15:46:46 ui234(27)> sudo vi /etc/pam.conf
...
15:47:46 ui234(28)> sudo -k
15:47:50 ui234(29)> sudo true
AFS Password:
panic[cpu0]/thread=3000162e6a0: BAD TRAP: type=34 rp=2a1002978b0 addr=33 mmu_fsr=0
sudo: alignment error:
addr=0x33
pid=1255, pc=0x10b3cb0, sp=0x2a100297151, tstate=0x80001605, context=0x9a0
g1-g7: 42, 42, 0, 210, 0, 0, 3000162e6a0
000002a1002975d0 unix:die+9c (34, 2a1002978b0, 33, 0, 2a100297690, c1e00000)
%l0-3: 00000000c0800000 0000000000000034 0000000000000000 00000300000715f0
%l4-7: 0000030000071640 0000000000000000 000000000000f000 0000000001076000
000002a1002976b0 unix:trap+690 (2a1002978b0, 10009, 0, 80000b, 0, 3000162e6a0)
%l0-3: 0000000000000000 0000060004c51bc0 0000000000000034 0000060004dee710
%l4-7: 0000000000000000 0000000000000000 000000000000f000 0000000000010200
000002a100297800 unix:ktl0+48 (6000373bcf0, 0, 4e8, 42, 42, 3)
%l0-3: 0000000000000006 0000000000001400 0000000080001605 000000000101aa04
%l4-7: 0000060000830000 0000000000001080 0000000000000000 000002a1002978b0
000002a100297950 genunix:getproc+11c (2a100297ad8, 0, 60004c51bc0, 60004e007b8, 60004c51bc0, 1837400)
%l0-3: 000006000373bcf0 00000000018a5c00 0000000000000000 ffffffffffffffff
%l4-7: 0000060004e007d0 0000060004e00bc8 00000000000004e8 0000000000000000
000002a100297a00 genunix:cfork+94 (0, 1, 0, 1, 60004c51bc0, 0)
%l0-3: 0000000000000000 0000000000000000 00000000ae540000 000000000000ae54
%l4-7: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
Thanks again for the input
--Kris
> trap type 34 = "memory address not aligned". Your callback looks
> like it dies somewhere while trying to send a udp datagram,
> which doesn't totally make sense.
> It's possible that some part of openafs has somehow managed
> to construct a udp datagram that contains words that are not aligned
> on word boundaries. The only way that could be associated with
> klog/aklog would be if some part of rxkad managed to do this.
> It's possible I'm reading this wrong and something quite different
> actually happened. My expectation is that things that could
> go wrong with klog/rxkad would happen a bit quicker.
>
> Note: I don't have any solaris 10 machines -- so other than
> asking questions/suggesting ways it might die I may not be able
> to help you.
>
> -Marcus Watts
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
--
Thomas Kris Kasner
Qualcomm Inc.
5775 Morehouse Drive
San Diego, CA 92121
(858)658-4932
"There are many intelligent species in the universe.
They are all owned by cats."
--Anonymous