vCloud vApp fails to start with error "Edge deployment failed on host as the message bus infra on the host is not green"
Recently, we hit an issue where starting a vApp in an org failed with the error "Edge deployment failed on host host-33 as the message bus infra on the host is not green. Please call API to re-sync the message bus and after successful re-sync, try edge installation., error code 10921"
The root cause lies entirely on the ESXi / NSX side and not in vCloud Director, but it blocks power-on operations for tenants in vCloud Director.
Error stack :
++++++++++++++++++++++++++++++++++++++++++++++++++++
[ 3455c8b1-d86f-4ee1-b982-d99f2737a5a3 ] Internal Server Error
- java.util.concurrent.ExecutionException: com.vmware.ssdc.util.LMException: Unable to start vApp "Test-01".
- com.vmware.ssdc.util.LMException: Unable to start vApp "Test-01".
- Unable to start vApp "Test-01".
- Unable to deploy network "Fence(urn:uuid:577df80b-f352-45f3-9739-8819fae1a02a)".
com.vmware.vcloud.common.network.VsmException: Edge deployment failed on host host-33 as the message bus infra on the host is not green. Please call API to re-sync the message bus and after successful re-sync, try edge installation., error code 10921
- com.vmware.vcloud.common.network.VsmException: Edge deployment failed on host host-33 as the message bus infra on the host is not green. Please call API to re-sync the message bus and after successful re-sync, try edge installation., error code 10921
- Edge deployment failed on host host-33 as the message bus infra on the host is not green. Please call API to re-sync the message bus and after successful re-sync, try edge installation., error code 10921
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
The error indicates a problem with the message bus infra agent (vsfwd) on the ESXi host where the NSX Edge for the vApp was being deployed. The message bus infrastructure must be up and running on the ESXi host for an NSX Edge deployment to succeed.
To check the status of the firewall agent (vsfwd, the message bus client), go to Networking & Security --> Installation --> click on the cluster --> Actions --> Communication Channel Health.
Log in to the host that shows the firewall agent as down and run the following commands to confirm it.
Verify that vsfwd is configured; the output should display the NSX Manager IP address.
Affected host:
[root@Test-esx85:/var/log] esxcfg-advcfg -g /UserVars/RmqIpAddress
Value of RmqIpAddress is unset
Working host:
[root@Test-esx45:~] esxcfg-advcfg -g /UserVars/RmqIpAddress
Value of RmqIpAddress is 192.168.110.1 [NSX manager IP]
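If you need to run this check across many hosts, the output can be parsed by a small script. The following is only a minimal sketch: the function name parse_rmq_ip is hypothetical, and it assumes the command output has exactly the "Value of RmqIpAddress is ..." form shown above.

```python
import re


def parse_rmq_ip(output: str):
    """Return the configured RabbitMQ broker address, or None if it is unset.

    Assumes the output format of `esxcfg-advcfg -g /UserVars/RmqIpAddress`
    shown in this post: "Value of RmqIpAddress is <address or 'unset'>".
    """
    match = re.search(r"Value of RmqIpAddress is (\S+)", output)
    if match is None or match.group(1) == "unset":
        return None
    return match.group(1)


if __name__ == "__main__":
    # Sample outputs copied from the affected and working hosts above.
    print(parse_rmq_ip("Value of RmqIpAddress is unset"))
    print(parse_rmq_ip("Value of RmqIpAddress is 192.168.110.1"))
```

A return value of None marks a host where vsfwd has no broker address configured.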
Verify the VSFWD connectivity to the RabbitMQ broker. Run this command on ESXi hosts to see a list of connections from the vsfwd process on the ESXi host to the NSX Manager.
Affected host:
[root@Test-esx85:/var/log] esxcli network ip connection list | grep 5671
[NO OUTPUT]
Working host:
[root@Test-esx45:/var/log] esxcli network ip connection list | grep 5671
tcp 0 0 192.168.110.51:29969 192.168.110.15:5671 ESTABLISHED 35505 newreno vsfwd
tcp 0 0 192.168.110.51:29968 192.168.110.15:5671 ESTABLISHED 35505 newreno vsfwd
There should be at least two connections on port 5671. There can be more, because NSX Edge virtual machines deployed on the ESXi host also establish connections to the RMQ broker.
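The "at least two connections" rule can be checked with a short script. This is a sketch under assumptions: the function name count_vsfwd_connections is hypothetical, and the column order (protocol, queues, local address, foreign address, state, world ID, congestion algorithm, world name) is taken from the sample output above.

```python
def count_vsfwd_connections(output: str, port: int = 5671) -> int:
    """Count ESTABLISHED vsfwd connections to the given remote port.

    Assumes each connection line has nine whitespace-separated columns,
    as in the `esxcli network ip connection list` sample in this post.
    """
    count = 0
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # skip headers, blank lines, and anything unexpected
        foreign, state, world_name = fields[4], fields[5], fields[8]
        if (foreign.endswith(f":{port}")
                and state == "ESTABLISHED"
                and world_name == "vsfwd"):
            count += 1
    return count


if __name__ == "__main__":
    sample = (
        "tcp 0 0 192.168.110.51:29969 192.168.110.15:5671 ESTABLISHED 35505 newreno vsfwd\n"
        "tcp 0 0 192.168.110.51:29968 192.168.110.15:5671 ESTABLISHED 35505 newreno vsfwd\n"
    )
    healthy = count_vsfwd_connections(sample) >= 2
    print("vsfwd message bus connections look healthy:", healthy)
```

Filtering on the world name keeps connections opened by NSX Edge virtual machines out of the count.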
You can also get the list of hosts with issues from the NSX Manager dashboard under Firewall Publish Status.
There are two solutions for this issue:
1) Run the message bus re-sync API call for the affected host or cluster: POST https://<NSX-IP>/api/2.0/nwfabric/configure?action=synchronize
Request Body:
<nwFabricFeatureConfig>
<featureId>com.vmware.vshield.vsm.messagingInfra</featureId>
<resourceConfig>
<resourceId>{HOST/CLUSTER MOID}</resourceId>
</resourceConfig>
</nwFabricFeatureConfig>
2) Collect the support bundles from the ESXi host, NSX Manager, and vCenter Server, disable the ESXi host that shows the firewall agent down in vCloud Director, and then reboot it.
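The re-sync request from option 1 can also be prepared from a script. The sketch below only builds the request with the Python standard library and does not send it. The function name build_resync_request, the NSX Manager hostname, and the credentials are placeholders, and the use of HTTP basic authentication and an application/xml content type are assumptions you should confirm for your environment.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET


def build_resync_request(nsx_host, resource_id, user, password):
    """Build (but do not send) the message bus re-sync POST request."""
    # Request body, matching the XML shown in option 1.
    root = ET.Element("nwFabricFeatureConfig")
    ET.SubElement(root, "featureId").text = "com.vmware.vshield.vsm.messagingInfra"
    config = ET.SubElement(root, "resourceConfig")
    ET.SubElement(config, "resourceId").text = resource_id  # host or cluster MOID
    body = ET.tostring(root)  # bytes, no XML declaration

    # Assumption: basic authentication with an NSX Manager account.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    url = f"https://{nsx_host}/api/2.0/nwfabric/configure?action=synchronize"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    # Placeholder hostname and credentials; host-33 is the host from the error.
    req = build_resync_request("nsx.example.local", "host-33", "admin", "changeme")
    print(req.get_method(), req.full_url)
    print(req.data.decode())
```

To actually send it you would pass the request to urllib.request.urlopen. If NSX Manager uses a self-signed certificate, the TLS verification settings need to be handled separately.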
After fixing the message bus, validate the status under Networking & Security --> Installation --> click on the cluster --> Actions --> Communication Channel Health.
Refer: https://kb.vmware.com/s/article/2133897 (Understanding and troubleshooting Message Bus)