Hi there,
since some weeks we are experiencing increasingly issues with stucking deployments.
Not always but often this issue occures while new deployments, but more often while reprovision a VM. The affected VM is running and ready for login but stays in state "Finalized" instead of "On".
Therefore all Day2-Actions except of "Destroy" are inaccessable for enduser. We have a workaround Workflow, where a user can select his stucking VM and reset the status to "On" in Iaas DB. But this is not a solution, because this behaviour occures at about 80% of every reprovion task.
Does anyone have similar issues?
Attention: These Logs are upside down because they are exported from Log Insight. The newest lines are on top.
Here a badtrace from all.log ->
ATDEVCICPWS0031: Workflow upgrade completed
ATDEVCICPWS0031: VirtualMachineTracking: workflowId=[59ab1756-3c37-4714-a1dc-ce266bf4e0ca], state=[Finalized] is being upgraded, state plugins will not be called
ATDEVCICPWS0031: Not executing external workflows or any activities in state Requested while workflow is being upgraded.
ATDEVCICPWS0031: Not executing external workflows or any activities in state Requested while workflow is being upgraded.
ATDEVCICPWS0031: Waiting for workflow upgrade to complete ...
ATDEVCICPWS0031: VirtualMachineTracking: workflowId=[59ab1756-3c37-4714-a1dc-ce266bf4e0ca], state=[Requested] is being upgraded, state plugins will not be called
ATDEVCICPWS0031: Not executing external workflows or any activities in state Requested while workflow is being upgraded.
ATDEVCICPWS0031: Not executing external workflows or any activities in state Requested while workflow is being upgraded.
token="zzEvD0ay"] GetWorkItem AgentID:ESXVCSA12Q taskType: PowerOn Long Running: False Data Collection: False Can run long running data collection: True Can run long ru
token="cUQ9h25J"] Machine 59ab1756-3c37-4714-a1dc-ce266bf4e0ca: Name: ATDEVCICPWS0031, State: Finalized.
oken="cUQ9h25J"] ATDEVCICPWS0031: [EventBrokerService] Resuming workflow with id 59ab1756-3c37-4714-a1dc-ce266bf4e0ca on queue 6061669e-a75d-427a-b955-a85cd0dbcced. Sta
token="cUQ9h25J"] [EML001] ATDEVCICPWS0031: Email template file not found [C:\Program Files (x86)\VMware\vCAC\Server\Templates\ManagerPowerOn.xslt] for template [Manage
token="cUQ9h25J"] ATDEVCICPWS0031: Sent email to ----using template PowerOn
and here a snippet from good trace ->
Virtual Machine ATDEVCICPWS0031 is on
ATDEVCICPWS0031: [EventBrokerService] Resuming workflow with id 59ab1756-3c37-4714-a1dc-ce266bf4e0ca on queue a572343e-7232-436a-b84b-14766a0831a2. State
Machine 59ab1756-3c37-4714-a1dc-ce266bf4e0ca: Name: ATDEVCICPWS0031, State: On.
ATDEVCICPWS0031: sendEBSMessage56(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) Sent event with id 5f6ec210-f13e-11e7-560a-31c639da8969
ATDEVCICPWS0031: sendEBSMessage56(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) About to send an event with timeout (00:30:00) for state VMPSMasterWork
ATDEVCICPWS0031: sendEBSMessage56(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) for state VMPSMasterWorkflow32.On phase PRE event
ATDEVCICPWS0031: [EventBrokerService] Resuming workflow with id 59ab1756-3c37-4714-a1dc-ce266bf4e0ca on queue 2de0d7f0-2069-4bc4-844b-ecdcf3d0a2ef. Stat
ATDEVCICPWS0031: sendEBSMessage55(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) Sent event with id 5bc19890-f13e-11e7-560a-31c639da8969
ATDEVCICPWS0031: sendEBSMessage55(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) About to send an event with timeout (00:30:00) for state VMPSMasterWorkf
ATDEVCICPWS0031: sendEBSMessage55(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) for state VMPSMasterWorkflow32.TurningOn phase POST event
ATDEVCICPWS0031: [EventBrokerService] Resuming workflow with id 59ab1756-3c37-4714-a1dc-ce266bf4e0ca on queue 12706aac-e8c4-4eb5-ae6e-e541aeb56e82. Stat
[EML001] ATDEVCICPWS0031: Email template file not found [C:\Program Files (x86)\VMware\vCAC\Server\Templates\ManagerPowerOn.xslt] for template [ManagerP
ATDEVCICPWS0031: Sent email to ----- using template PowerOn
These are the last lines from log concerning this specific VM.
It seems to me IaaS is waiting for something. But between "PowerOn" and "PowerOn Long Running" are only a few seconds in logfile