[OpenAFS] Re: undpInOverflows on heavily loaded AFS clients (Solaris 10 X86, SunRay server)

Karl Behler karl.behler@ipp.mpg.de
Thu, 20 Apr 2017 19:04:34 +0200


Deeper searching brought us to Solaris 10 source code (apparently from 
around 2005) in which the network stack matches the stack trace from 
our dtrace script.
With the information found there it was possible to examine further 
why the packets are dropped: it turns out that the enlarged buffer 
size from our ndd command is not used for this particular socket 
connection.
This implies that the UDP buffer size for this socket is set 
explicitly, most likely by OpenAFS itself.
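
This hypothesis can also be checked at run time with a dtrace 
one-liner of the following kind (a sketch: it assumes fbt exposes the 
sockfs function sosetsockopt on this kernel build, and uses the 
Solaris values 0xffff/0x1002 for SOL_SOCKET/SO_RCVBUF):

    dtrace -n 'fbt::sosetsockopt:entry
        /arg1 == 0xffff && arg2 == 0x1002/
        { printf("SO_RCVBUF set by %s", execname); stack(); }'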

Examining the AFS source for occurrences of setsockopt() with 
SO_RCVBUF yields the following C code:

    len = rx_UdpBufSize;
    error = sockfs_sosetsockopt(so, SOL_SOCKET, SO_RCVBUF, &len,
                                sizeof(len));

Here rx_UdpBufSize turns out to be a constant defined in a global 
header file with the value 64*1024.

We are presently modifying this constant and testing whether that 
helps.
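
The change under test is essentially the following (a sketch rather 
than a final patch; the 4 MiB figure is our trial value, not an 
OpenAFS default):

    /* Trial: request a larger UDP receive buffer than the fixed
     * 64 KiB that rx_UdpBufSize provides; 4 MiB is our test value. */
    len = 4 * 1024 * 1024;
    error = sockfs_sosetsockopt(so, SOL_SOCKET, SO_RCVBUF, &len,
                                sizeof(len));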



On 19.04.17 19:26, Karl Behler wrote:
>
> Dear All,
>
> Today I posted the following question to the Oracle Solaris Community network.
> However, I would also like to hear your opinions or hints about it.
>
> In fact, under high load (30+ users logged in to one SunRay server) we 
> occasionally experience hangs of several seconds up to minutes on the 
> users' desktops. This is not a continuously growing degradation but 
> more like hitting a wall (nothing works at all) and then falling back 
> as if nothing had happened. But the frequency of these events rises 
> over the week, and typically in the afternoons systems may 
> occasionally come to a complete and sometimes unrecoverable halt 
> (requiring a hard reboot).
>
>> We are using OpenAFS as the network filesystem for home directories 
>> and data files on our SunRay servers. As the number of user sessions 
>> (20-40) and the activity (5-10% CPU usage) rise on these SunRay 
>> servers, we observe alarming values from the command "netstat -s".
>>
>> IPv4    ipForwarding        =     2     ipDefaultTTL =   255
>>         ipInReceives        =123501543  ipInHdrErrors =     0
>>         ipInAddrErrors      =     0     ipInCksumErrs =     0
>>         ipForwDatagrams     =     0     ipForwProhibits =    19
>>         ipInUnknownProtos   =  3361     ipInDiscards =     6
>>         ipInDelivers        =123975114  ipOutRequests =114328336
>> ...
>>         tcpInErrs           =     0     udpNoPorts =  3248
>>         udpInCksumErrs      =     0     udpInOverflows =  1848
>> ...
>>
>> Since the protocol used between the OpenAFS file servers and the 
>> OpenAFS clients on the SunRay servers is UDP based, we used rxdebug 
>> for further insight. We are seeing resends and waiting calls.
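>>
>> For reference, an invocation of the kind we used (7001 being the rx 
>> port of the client; the exact flags may vary):
>>
>>     rxdebug x.x.100.129 7001 -rxstats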
>>
>> Using dtrace to analyze which packets are being dropped, one clearly 
>> (and only) sees packets sent from the OpenAFS fileservers 
>> x.x.[100.59,100.67,30.41] to the SunRay server client on x.x.100.129:
>>
>>     13   5093 ip_udp_input:udpIfStatsInOverflows src: x.x.100.59:7000
>>     dst: x.x.100.129:7001 count: 1
>>     13   5093 ip_udp_input:udpIfStatsInOverflows src: x.x.100.59:7000
>>     dst: x.x.100.129:7001 count: 1
>>     13   5093 ip_udp_input:udpIfStatsInOverflows src: x.x.100.59:7000
>>     dst: x.x.100.129:7001 count: 1
>>     17   5093 ip_udp_input:udpIfStatsInOverflows src: x.x.100.67:7000
>>     dst: x.x.100.129:7001 count: 1
>>     13   5093 ip_udp_input:udpIfStatsInOverflows src: x.x.100.59:7000
>>     dst: x.x.100.129:7001 count: 1
>>     13   5093 ip_udp_input:udpIfStatsInOverflows src: x.x.100.59:7000
>>     dst: x.x.100.129:7001 count: 1
>>     35   5093 ip_udp_input:udpIfStatsInOverflows src: x.x.30.41:7000
>>     dst: x.x.100.129:7001 count: 1
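>>
>> The script behind this output had roughly the following shape (a 
>> reconstruction sketch: the argument position of the ipha_t in 
>> ip_udp_input and the use of fbt on this static function are 
>> assumptions for this kernel build; the port numbers were taken from 
>> the UDP header in the same way):
>>
>>     dtrace -n '
>>     fbt::ip_udp_input:entry  { self->ipha = (ipha_t *)arg2; }
>>     mib:::udpIfStatsInOverflows /self->ipha/ {
>>         printf("src: %s dst: %s count: 1",
>>             inet_ntoa(&self->ipha->ipha_src),
>>             inet_ntoa(&self->ipha->ipha_dst));
>>     }
>>     fbt::ip_udp_input:return { self->ipha = 0; }'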
>>
>> According to the knowledge base document "netstat -s : Information 
>> and notes" (Doc ID 1010792.1 
>> <https://support.oracle.com/rs?type=doc&id=1010792.1>), it is 
>> recommended to increase the udp_recv_hiwat parameter.
>>
>> However, using "ndd -set /dev/udp udp_recv_hiwat 134217728" (128 MB) 
>> does not reduce the udpInOverflows.
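>>
>> The active value can be read back with
>>
>>     ndd -get /dev/udp udp_recv_hiwat
>>
>> Note also that on Solaris 10 the separate ceiling udp_max_buf (2 MB 
>> by default) caps what an application may request via 
>> setsockopt(SO_RCVBUF); whether that limit plays a role here is, for 
>> now, an open assumption on our side.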
>>
>> Further research using dtrace provides evidence that the UDP packets 
>> are discarded early in the IP stack:
>>
>> dtrace -n 'mib:::udpIfStatsInOverflows{stack();}'
>>
>> 12   4648 ip_udp_input:udpIfStatsInOverflows
>>               ip`ip_input+0xcb2
>>               dls`soft_ring_drain+0x93
>>               dls`soft_ring_worker+0xdb
>>               unix`thread_start+0x8
>>
>> However, this is where our debugging possibilities end, along with 
>> our insight into what is going on.
>>
>> This all happens on an Oracle Blade X4-2B (Blade Center 6000) under 
>> SunOS 5.10 Generic_150401-48 with SunRay server software 
>> 4.5.4_34,REV=2015.04.14.10.39. Two 10 Gb network interfaces are in use.
>>
>> Any hint on how one could proceed with analyzing the situation (e.g. 
>> understanding the stack) or which further system tuning parameters 
>> to try would be very much appreciated.
>>
> I'm including the netstat document from the Oracle knowledge base 
> since it cannot be found by a web search.
>
> Best regards,
>
> Karl
>
>
> -- 
> Dr. Karl Behler	
> CODAC & IT services ASDEX Upgrade
> phon +49 89 3299-1351 fax 3299-961351
>


-- 
Dr. Karl Behler	
CODAC & IT services ASDEX Upgrade
phon +49 89 3299-1351 fax 3299-961351

