[OpenAFS] Craziness with cache, Input/Output error, Linux

Thu, 31 Jan 2008 12:43:13 -0500

On Jan 31, 2008 12:10 PM, Jeff Blaine <jblaine@kickflop.net> wrote:

> Yes, /cache is an ext3 filesystem.
>
>
Going back over Zephyr logs, I found what Kris was doing. He said:

In essence, it seems possible that as AFS flushes data to the
server and deletes the contents of cache blocks, it tries to write new data
immediately after deleting the old data, before the delete has been
committed to disk.  In that case, the data blocks taken up by the deleted
data are not yet released for reallocation, and thus while AFS thinks there
is enough room left in the cache, there actually isn't.
Note that if e.g. jbd-debug is enabled (in my case, at a level of 2 or more),
the kernel does enough extra work that kjournald actually manages to
complete the commits before AFS demands the space, and thus the ENOSPC
condition is not triggered.
How is that for a nice issue :)
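
To make the accounting gap described above concrete, here is a toy model in
C. It is only a sketch of the reasoning, not ext3 or OpenAFS code: the
struct toy_fs type and all toy_* functions are hypothetical names invented
for illustration, and "commit" stands in for the journal commit.

```c
#include <assert.h>
#include <errno.h>

/* Toy model of block accounting on a journaled filesystem.
 * All names here are hypothetical; nothing is taken from ext3 or OpenAFS. */
struct toy_fs {
    long total_blocks;  /* capacity of the cache partition */
    long used_blocks;   /* blocks holding live data */
    long pending_free;  /* blocks deleted, but not yet committed */
};

/* What the cache manager believes: deleted blocks are free at once. */
long toy_believed_free(const struct toy_fs *fs)
{
    return fs->total_blocks - fs->used_blocks;
}

/* What can really be allocated before the pending delete commits. */
long toy_actual_free(const struct toy_fs *fs)
{
    return fs->total_blocks - fs->used_blocks - fs->pending_free;
}

/* Delete n blocks: they leave "used" but stay unavailable until commit. */
void toy_delete(struct toy_fs *fs, long n)
{
    assert(n >= 0 && n <= fs->used_blocks);
    fs->used_blocks -= n;
    fs->pending_free += n;
}

/* Commit: the deleted blocks finally become reusable. */
void toy_commit(struct toy_fs *fs)
{
    fs->pending_free = 0;
}

/* Write n blocks: fails with -ENOSPC if the space is not really there. */
int toy_write(struct toy_fs *fs, long n)
{
    if (n > toy_actual_free(fs))
        return -ENOSPC;
    fs->used_blocks += n;
    return 0;
}
```

In this model, a toy_write() issued between toy_delete() and toy_commit()
can fail with -ENOSPC even though toy_believed_free() reports enough room,
which is the window Kris describes.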

[...]

Good news on that one is that Chas believes that just setting
S_SYNC on the inode before truncation is probably a good thing on *any*
filesystem on Linux, not just ext3, making the patch trivial.  Testing
scheduled for that (and potential impact, if any) later today.
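
For readers who want a feel for the idea, here is a rough user-space
analogy in C. It is not the kernel patch discussed above (which would set
S_SYNC on the inode inside the cache manager); it only shows the same
intent with standard POSIX calls: truncate the file, then wait for the
change to reach disk before relying on the freed space. The function name
truncate_cache_file_sync is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Truncate a cache file to zero length and wait for it to reach disk.
 * Hypothetical helper: a user-space analogy only, not OpenAFS code.
 * Returns 0 on success, -1 on failure with errno set. */
int truncate_cache_file_sync(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* Drop the old contents, then block until the change is flushed. */
    if (ftruncate(fd, 0) != 0 || fsync(fd) != 0) {
        int saved = errno;  /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

Whether fsync() after the truncate closes the same window as S_SYNC on a
given filesystem is an assumption here, so treat this as a starting point
for experiments rather than a fix.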

That was in the February 2005 timeframe. I suppose I should go look in RT;
a look at osi_file.c on Linux yields only
+    filp->f_mode = FMODE_READ|FMODE_WRITE;

in rev 1.24, which is probably related.
