[OpenAFS] Openafs-1.6.5 client crash when update the OPTIONS in afs config file

Stephan Wiesand stephan.wiesand@desy.de
Sat, 27 Dec 2014 20:41:09 +0100


On Dec 27, 2014, at 13:43 , Sergio Gelato wrote:

> * huangql [2014-12-24 17:46:19 +0800]:
>> I failed to restart afs service after I changed OPTIONS value in /etc/sysconfig/afs file.
>
> What was the old value, and what did you change it to?

I second this question, as well as the others.

>> At this time, I need to reboot the machine to make the new configuration validate.
>
> Are you saying that afsd crashes on service restart but not when it is started
> for the first time after a reboot (with the same options)?
>
>> Openafs version: 1.6.5
>
> A bit old. You may want to check the change logs of later versions for
> potentially relevant bug fixes.

Looks like the ordinary SL6.{x|x<=5} packages. They're not supposed to
crash under normal circumstances. Updating to 6.6 should bring the
OpenAFS client to version 1.6.10, and there are indeed many fixes in
there that should make the client fail with an error message rather than
a panic or a segfault. But the culprit is most likely bad input from
/etc/sysconfig/afs in either case. So, again: what's that file's
content?
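
For anyone following along, one way to answer that question is to list only the active (non-comment, non-blank) lines of the file. This is a minimal sketch: the helper name show_active_settings is made up for illustration, and which variables the file contains depends on the packaging.

```shell
#!/bin/sh
# Print only the active settings of a sysconfig-style file:
# drop blank lines and lines whose first non-blank character is '#'.
# show_active_settings is a hypothetical helper name, not part of OpenAFS.
show_active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Usage on the affected client (path taken from the report):
#   show_active_settings /etc/sysconfig/afs
```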

>> Os version: Scientific Linux release 6.5 (Carbon)  2.6.32-431.el6.x86_64
>>
>> I got the error message as following:
>>
>> [root@bws0609 ~]# /etc/init.d/afs restart
>> Stopping AFS client.....
>> Sending all processes using /afs the TERM signal ...       [  OK  ]
>> Sending all processes using /afs the KILL signal ...       [  OK  ]
>> Starting AFS client.....
>> /etc/init.d/afs: line 230: 26271 Segmentation fault      /usr/vice/etc/afsd ${AFSD_OPTIONS}

You shouldn't run the init script directly. Use "service afs restart"
instead.

> Has a core file been left behind? If so, could you extract a backtrace from it?
>
>> Dec 24 17:30:29 bws0609 kernel: Starting AFS cache scan...
>> Dec 24 17:30:29 bws0609 kernel: afsd[26271]: segfault at 18 ip 0000003736679753 sp 00007fff5f346fa0 error 4 in libc-2.12.so[3736600000+18a000]
>
> To me this looks like an attempt to dereference a null pointer to a struct
> (with the component of interest being at offset 0x18). A backtrace might
> help one figure out where that unexpected null pointer came from.
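
To make the reading of that log line concrete: the offset of the faulting instruction inside libc is the ip value minus the mapping base shown in square brackets, and a backtrace can be pulled from a core file with gdb in batch mode. This is a rough sketch that assumes a core file was actually written. The core file name and both helper names are hypothetical.

```shell
#!/bin/bash
# Offset of the faulting instruction within the mapped library:
# instruction pointer minus the mapping's base address. Both arguments are
# hex without a 0x prefix, as printed in the kernel segfault line.
fault_offset() {
    printf '0x%x\n' $(( 0x$1 - 0x$2 ))
}

fault_offset 0000003736679753 3736600000   # prints 0x79753

# Print a backtrace from a core file without an interactive gdb session.
# $1: path to the binary, $2: path to the core file.
print_backtrace() {
    gdb -batch -ex bt "$1" "$2"
}

# Usage, if a core was left behind (core.26271 is a guessed file name):
#   print_backtrace /usr/vice/etc/afsd core.26271
```

Without debug symbols installed, the frames may show only raw addresses, in which case the libc offset computed above is still something to go on.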

-- 
Stephan Wiesand
DESY -DV-
Platanenenallee 6
15738 Zeuthen, Germany