http://wiki.lustre.org/index.php?title=Imperative_Recovery&feed=atom&action=historyImperative Recovery - Revision history2024-03-28T11:08:57ZRevision history for this page on the wikiMediaWiki 1.39.3http://wiki.lustre.org/index.php?title=Imperative_Recovery&diff=3752&oldid=prevAdilger: /* Testing */2020-02-28T09:08:47Z<p><span dir="auto"><span class="autocomment">Testing</span></span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 02:08, 28 February 2020</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l15">Line 15:</td>
<td colspan="2" class="diff-lineno">Line 15:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=Testing=</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=Testing=</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker" data-marker="−"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>Scalability is very important for <del style="font-weight: bold; text-decoration: none;">imperative recovery </del>because it must support at least 100K clients. For this reason, in addition to the regular unit and regression tests, there is a scalability test plan to verify that it works at that scale. See the [[Imperative Recovery Test Plan]].</div></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>Scalability is very important for <ins style="font-weight: bold; text-decoration: none;">Imperative Recovery </ins>because it must support at least 100K clients. For this reason, in addition to the regular unit and regression tests, there is a scalability test plan to verify that it works at that scale. See the [[Imperative Recovery Test Plan]].</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=Presentation=</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=Presentation=</div></td></tr>
<!-- diff cache key wiki_lustre_org:diff::1.12:old-3751:rev-3752 -->
</table>Adilgerhttp://wiki.lustre.org/index.php?title=Imperative_Recovery&diff=3751&oldid=prevAdilger: /* High Level Design */ add link to Imperative Recovery hlD2020-02-28T09:08:22Z<p><span dir="auto"><span class="autocomment">High Level Design: </span> add link to Imperative Recovery hlD</span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 02:08, 28 February 2020</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l11">Line 11:</td>
<td colspan="2" class="diff-lineno">Line 11:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=High Level Design=</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=High Level Design=</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker" data-marker="−"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>The High Level Design is described in [[Imperative Recovery <del style="font-weight: bold; text-decoration: none;">HLD</del>]].</div></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>The High Level Design is described in <ins style="font-weight: bold; text-decoration: none;">the </ins>[[<ins style="font-weight: bold; text-decoration: none;">media:Imperative_Recovery_Design.pdf|</ins>Imperative Recovery <ins style="font-weight: bold; text-decoration: none;">Design</ins>]].</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=Testing=</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=Testing=</div></td></tr>
<!-- diff cache key wiki_lustre_org:diff::1.12:old-3277:rev-3751 -->
</table>Adilgerhttp://wiki.lustre.org/index.php?title=Imperative_Recovery&diff=3277&oldid=prevAdilger: Initial import from https://wiki.hpdd.intel.com/display/PUB/Imperative+Recovery2018-06-09T00:03:56Z<p>Initial import from https://wiki.hpdd.intel.com/display/PUB/Imperative+Recovery</p>
<p><b>New page</b></p><div>=Introduction=<br />
<br />
If a Lustre server fails, the client's recovery mechanism first detects this when an RPC does not receive a reply within the expected processing time. The client then tries to resend the RPC to the server; if that fails, it enters recovery and tries to reconnect to each of the servers configured for that target (OST or MDT). To avoid flooding the network with reconnect requests, the client sends reconnection RPCs only periodically.<br />
<br />
Imperative recovery reduces the time clients take to detect that a target has restarted after a failure, so that they can reconnect as soon as possible. The MGS acts as a reflector: it notifies clients when a target is newly registered.<br />
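<br />
The difference between the two detection paths can be illustrated with a small model. This is only a sketch under stated assumptions, not Lustre code: the function names, the 30-second reconnect interval, and the 1-second notification latency are hypothetical values chosen for illustration. It compares how long a client waits for its next periodic reconnect attempt after a target restarts with how long it waits when it is notified directly.<br />

```python
import random


def periodic_detection_delay(restart_time, first_attempt, interval):
    """Seconds from target restart until the client's next periodic attempt.

    Hypothetical model: the client retries at first_attempt,
    first_attempt + interval, and so on. Whole seconds keep the math exact.
    """
    if restart_time <= first_attempt:
        return first_attempt - restart_time
    # Ceiling division: intervals needed to reach an attempt at or after restart.
    k = -(-(restart_time - first_attempt) // interval)
    return first_attempt + k * interval - restart_time


def worst_case_delays(num_clients, restart_time, interval, notify_latency, seed=0):
    """Return (worst periodic delay, notified delay) across simulated clients.

    Each client gets a random retry phase. With notification, every client
    is assumed to learn of the restart after the same notify_latency.
    """
    rng = random.Random(seed)
    periodic = [
        periodic_detection_delay(restart_time, rng.randrange(interval), interval)
        for _ in range(num_clients)
    ]
    return max(periodic), notify_latency


if __name__ == "__main__":
    # Illustrative values only; these are not Lustre defaults.
    worst_periodic, notified = worst_case_delays(
        num_clients=1000, restart_time=100, interval=30, notify_latency=1
    )
    print("worst periodic detection delay (s):", worst_periodic)
    print("notified detection delay (s):", notified)
```

In this model the periodic delay is bounded only by the retry interval, while the notified delay depends only on how quickly the notification arrives, which is the effect that using the MGS as a reflector is meant to achieve.<br />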
<br />
The following diagram outlines the communication process for Imperative Recovery:<br />
<br />
[[File:imperative-recovery.jpg]]<br />
<br />
=High Level Design=<br />
<br />
The High Level Design is described in [[Imperative Recovery HLD]].<br />
<br />
=Testing=<br />
<br />
Scalability is very important for imperative recovery because it must support at least 100K clients. For this reason, in addition to the regular unit and regression tests, there is a scalability test plan to verify that it works at that scale. See the [[Imperative Recovery Test Plan]].<br />
<br />
=Presentation=<br />
<br />
Imperative recovery was presented at [[Lustre_User_Group_2011|LUG 2011]]; see [[Media:LUG-2011-Jinshan_Xiong-Imperative_Recovery.pdf|the presentation]] and [https://www.youtube.com/watch?v=DnGo4ntS6WA the video].<br />
<br />
=Results=<br />
<br />
In our simulation test on Hyperion, a large test cluster at LLNL, with 125 client nodes and 600 mountpoints on each client (75,000 mountpoints in total), we killed an OSS and measured OST recovery time. With imperative recovery, recovery finished in around 60 seconds; without it, recovery took about 300 seconds.<br />
<br />
[[Category:Architecture]]</div>Adilger