[OpenAFS] Solaris 10 deadlock issue

Patricia O'Reilly oreilly@qualcomm.com
Tue, 14 Jun 2011 16:54:29 -0700

OK. Since it is U8, it sounds like it isn't the same problem we've seen here.

If anyone is interested... it manifests itself as a slow death of the machine until the thing locks up completely. The scheduler keeps sending jobs to sleeping CPUs, and those jobs never return.

The summary is that the scheduler can send jobs to CPUs which were just put to sleep by a power-management bug that is new with Nehalem. All OSes which implemented the spec "Intel power management for Nehalem extensions" are susceptible to this problem. We started seeing it about a year ago (no one here told me, though, until it started locking up my file servers). Solaris 10 U8 and earlier did not experience the bug, but U9 does.

Intel has a CPU microcode update, and it is included in *some* vendors' BIOS updates (HP, for example), but not all vendors include it or say much about it. For now, with Solaris on Sun equipment, we have to use a workaround. It will be fixed in an upcoming Solaris 10 patch which works around it in the scheduler code. Here is the public bug report:

Bug 6958068: Nehalem deeper C-states cause erratic scheduling behavior

The workaround for us is to modify /etc/power.conf.
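
The exact change is not spelled out above, so treat the following as a sketch rather than a verified configuration. It assumes the cpu_deep_idle keyword described in power.conf(4), which keeps CPUs out of the deeper C-states, and assumes the file is then applied with pmconfig.

```
# /etc/power.conf
# Assumed workaround, not confirmed in this thread:
# stop CPUs from entering deep C-states.
cpu_deep_idle disable
```

After saving the file, running /usr/sbin/pmconfig as root should apply the setting without a reboot.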


Aaron Knister wrote:
> The box in question is an x86 VM running in VMware ESXi on a host with
> dual Opteron CPUs. I have also reproduced it on a pre-nehalem and a
> nehalem Intel system. All are running the latest patches to Solaris 10 u8.
> On Tue, Jun 14, 2011 at 6:32 PM, Patricia O'Reilly <oreilly@qualcomm.com> wrote:
>     Is this an x86 Solaris 10 box running on Nehalem?
>     Aaron Knister wrote:
>     > Good afternoon!
>     >
>     > I'm writing to report a deadlock issue I'm seeing on Solaris 10.
>     >
>     > What I've observed is that when a file larger than the configured size
>     > of the cache is copied out of AFS the cache manager deadlocks and all
>     > access to /afs on the affected system hangs until the system is
>     > rebooted. The issue occurs with a memory cache as well as a disk cache.
>     >
>     > The issue can be mitigated if the cache size is raised to the value of
>     > roughly half of the physical memory in the given system. The issue
>     > appeared somewhere between Solaris 10 "u8" and "u9."
>     >
>     > I've reproduced the problem using OpenAFS 1.5.78 and 1.6.0pre6
>     > and a Solaris 10 "u8" system with all of the latest patches applied.
>     >
>     > I've put together a tar file containing:
>     >
>     > - An fstrace dump starting a few seconds before I initiated the copy
>     > - A stack trace of the hung cp command
>     > - The output of cmdebug -long -server localhost run after AFS hangs
>     >
>     > The individual files as well as a tar file of them can be found here:
>     > http://userpages.umbc.edu/~aaronk/afs/solaris10-deadlock-issue.
>     >
>     > Any help would be greatly appreciated.
>     >
>     > Best,
>     > Aaron
>     >
>     > --
>     > Aaron Knister
>     > Systems Administrator
>     > Division of Information Technology
>     > University of Maryland, Baltimore County
>     > aaronk@umbc.edu
> -- 
> Aaron Knister
> Systems Administrator
> Division of Information Technology
> University of Maryland, Baltimore County
> aaronk@umbc.edu