Migrate/Stop VMs with local storage when preparing host for maintenance #4212
DaanHoogland
left a comment
I don't see much wrong with this code, but I am wondering whether and how this would impact all hypervisors. It was probably intended for KVM, knowing your business, so I trust that is fine. Any Marvin tests (so we can automate for other platforms)?
|
@DaanHoogland I did not answer your Marvin question, sorry for the delay. To be honest, I had no plans on creating Marvin tests for this. I can take a look at it, though. |
|
Ping for review @rhtyd, @nvazquez, @andrijapanicsb, @borisstoyanov, @DaanHoogland, @svenvogel, @weizhouapache, @RodrigoDLopez, @kiwiflyer, and others :-) |
|
@GabrielBrascher what was the behaviour so far? I believe we were stopping all VMs, right? I would be happy to see this new feature implement the old behaviour as the default behaviour - to keep backward compatibility, if that makes sense |
|
@andrijapanicsb normally VMs are (live) migrated when a host is put into Maintenance; however, if the host has VMs with local storage the host is placed in the "
Following the same concern that you raised, this implementation keeps the current behavior by default and therefore does not break backward compatibility for any deployed Zone. To change the behavior, the Root Admin needs to configure the global settings parameter
This parameter defines the strategy towards VMs with volumes on local storage when putting a host in maintenance.
|
|
that sounds good @GabrielBrascher - thx for confirming. |
|
@andrijapanicsb I tested with KVM only, indeed. I just re-checked the maintenance behavior with VMs running on shared storage. The zone runs one of the recent commits on master (not older than one week). Tested on a KVM node with NFS as primary storage:
|
wido
left a comment
LGTM based on code and discussions with Gabriel
|
Code has been updated, adding the option to stop VMs with either a force stop or a simple stop. Strategies are the following:
|
|
@blueorangutan package |
|
@GabrielBrascher a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos7 ✔centos8 ✔debian. JID-2852 |
|
Running tests. The strategy added for Stop is failing; StopForce works fine. |
|
@blueorangutan test keepEnv |
|
@DaanHoogland a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
Trillian test result (tid-3639)
|
|
@GabrielBrascher sounds like you are not ready to have this merged, are you? |
|
Code has been updated, reverting the change that allowed choosing between Stop and ForceStop. It turns out that the implementation got a bit tricky when HA management calls a VM
Fitting the "graceful" Stop into the current maintenance workflow might be a good enhancement for another PR, but for now, keeping the strategy feature as it is:
Manual tests are looking good. |
|
@blueorangutan package |
|
@GabrielBrascher a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos7 ✔centos8 ✔debian. JID-2862 |
|
@blueorangutan package |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✖centos7 ✖centos8 ✖debian. JID-2875 |
|
@blueorangutan package |
|
@GabrielBrascher a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. [S] |
|
Packaging result: ✔️ centos7 ✔️ centos8 ✔️ debian. SL-JID 94 |
|
[S] Trillian test result (tid-102)
|
|
Considering that this implementation only touches at
Thanks to all the reviewers @DaanHoogland @andrijapanicsb @rhtyd @wido @RodrigoDLopez; this one looks ready for merge. Are there any concerns that I have missed? |
Description
This PR adds a global settings parameter that configures the strategy for handling VMs with volumes on local storage when putting a host in maintenance.
Global settings name: host.maintenance.local.storage.strategy
Description: Defines the strategy towards VMs with volumes on local storage when putting a host in maintenance. The default strategy is 'Error', preventing maintenance in such a case. To migrate away VMs running on local storage choose 'Migrating' strategy. To stop VMs, choose 'Stopping' strategy.
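To illustrate how the three strategy values could drive the maintenance preparation, here is a minimal sketch. It is not the code from this PR: the class `LocalStorageMaintenanceSketch`, the enum `Strategy`, and the methods `fromSetting` and `plan` are hypothetical names, VMs are represented by plain strings, and falling back to 'Error' for an unset or unrecognised value is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; names and behaviour are assumptions, not CloudStack code.
final class LocalStorageMaintenanceSketch {

    // Mirrors the three values named in the setting description.
    enum Strategy {
        ERROR, MIGRATING, STOPPING;

        // Parses the raw setting value, ignoring case and surrounding whitespace.
        // Assumption: unset or unknown values fall back to the default, ERROR.
        static Strategy fromSetting(String value) {
            if (value == null || value.trim().isEmpty()) {
                return ERROR;
            }
            try {
                return valueOf(value.trim().toUpperCase(Locale.ROOT));
            } catch (IllegalArgumentException e) {
                return ERROR;
            }
        }
    }

    // Returns one planned action per VM that has volumes on local storage.
    // With ERROR, any such VM blocks maintenance by raising an exception.
    static List<String> plan(Strategy strategy, List<String> vmsOnLocalStorage) {
        List<String> actions = new ArrayList<>();
        for (String vm : vmsOnLocalStorage) {
            switch (strategy) {
                case MIGRATING:
                    actions.add("migrate " + vm);
                    break;
                case STOPPING:
                    actions.add("stop " + vm);
                    break;
                default:
                    throw new IllegalStateException(
                        "VM " + vm + " uses local storage; maintenance is not allowed");
            }
        }
        return actions;
    }
}
```

In this sketch a host without VMs on local storage yields an empty plan under every strategy, so the default only blocks maintenance when such a VM is actually present.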
Types of changes
How Has This Been Tested?