Automated 'continue on failure' deployment mode
Currently, with Guided Failure mode, you need to watch the deployment and manually continue if a machine fails. A deployment might be scheduled for 1am and going out to 50 machines; if one machine fails, the deployment to the rest should still continue.
A 'continue on failure' setting would let these deployments run unattended and carry on to the other machines even if machine X failed, taking the place of a user always hitting continue.
I agree, having an option to continue on failure is needed.
For instance, with the recent Slack outage, all of our deploys were failing because they couldn't notify Slack.
Mark Steller commented
I currently have a directory cleanup step that performs maintenance as the last step in my deployment process (only keep the last 3 files). If this step fails for whatever reason, it's not a big deal and can be handled manually until the issue is resolved.
Paul Lindhout commented
We've recently started using Octopus Deploy, but are very surprised this is not an option. We want to be able to deploy a new version of an application to all servers in an environment, and it should fail only on the servers that do not meet the requirements. In our case, those are servers with an older version of the database that is not compatible with the new application.
Is there not even a workaround for this?
Richard Dominguez commented
I would really love this. Currently we have 50 machines that we deploy to. Our current architecture is a monolith, so one application handles our site (yes I know...). In any case, I run a single IIS deployment, then deployment validation (if it succeeds, release the machine to our F5 production pool; if not, disable it from the pool). In our scenario it's OK if a few machines fail (it could be a network glitch, which does not occur very often), as long as the majority succeed.
It's now common to have constant smart validation of every machine via health checks, diagnostics, etc., and to have automation remove failing machines from a production environment.
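To make the "as long as the majority succeed" rule concrete, here is a minimal Python sketch. It is not an Octopus Deploy feature or API; the function names, the `min_success_ratio` parameter, and the machine names are all hypothetical and only illustrate the idea.

```python
from typing import Dict, List


def deployment_succeeded(results: Dict[str, bool],
                         min_success_ratio: float = 0.5) -> bool:
    """Return True when more than min_success_ratio of machines succeeded.

    `results` maps a machine name to whether its deployment validation passed.
    Both the mapping and the threshold are assumptions made for illustration.
    """
    if not results:
        return False
    succeeded = sum(1 for ok in results.values() if ok)
    return succeeded / len(results) > min_success_ratio


def machines_to_remove(results: Dict[str, bool]) -> List[str]:
    """List the machines that failed and should be disabled from the pool."""
    return sorted(name for name, ok in results.items() if not ok)


# Example: 50 machines, two of which failed validation.
results = {f"web{i:02d}": True for i in range(50)}
results["web07"] = False
results["web31"] = False
```

With these inputs the deployment as a whole would count as successful, and only `web07` and `web31` would be taken out of the pool.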
Pratik Surani commented
This feature is a must in my humble opinion. It should retry once or twice on the failed machine and then skip that machine for the rest of the steps.
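The behaviour described above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `run_steps`, the `retries` parameter, and the idea of a step as a function that raises on failure are hypothetical and do not reflect how Octopus Deploy works internally.

```python
from typing import Callable, Dict, List, Set

# Assumption: a step is a function that takes a machine name and raises on failure.
Step = Callable[[str], None]


def run_steps(machines: List[str], steps: List[Step],
              retries: int = 2) -> Dict[str, str]:
    """Run every step on every machine, retrying failures per machine.

    A machine that still fails after the retries is skipped for all
    remaining steps; the other machines carry on. Returns a mapping of
    skipped machine name to the last error message seen on it.
    """
    skipped: Set[str] = set()
    errors: Dict[str, str] = {}
    for step in steps:
        for machine in machines:
            if machine in skipped:
                continue  # this machine already failed an earlier step
            for _attempt in range(1 + retries):
                try:
                    step(machine)
                    break  # success, stop retrying
                except Exception as exc:
                    last_error = exc
            else:
                # Every attempt failed: drop the machine from later steps.
                skipped.add(machine)
                errors[machine] = str(last_error)
    return errors
```

The returned mapping gives the person checking the deployment the next morning a list of machines that need manual attention.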
Mike Jacobs commented
I would suggest that there be a choice of both a "Continue on failure" setting and an "Automatically retry attempts" setting (unset equals 1), so that a step that is trying to contact Slack, or something similar that may work upon a retry, gets a few tries before it fails and the deployment continues without error.
If this feature gets added then it should be available to all steps including those that are within a rolling deployment process.
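As a rough sketch of how the two suggested settings might combine on a single step, consider the Python below. The `Step` class, its `continue_on_failure` and `max_attempts` fields, and `run_deployment` are hypothetical names chosen for illustration; they are not existing Octopus Deploy settings.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], None]         # raises an exception on failure
    continue_on_failure: bool = False  # hypothetical "Continue on failure"
    max_attempts: int = 1              # hypothetical retry setting, unset equals 1


def run_deployment(steps: List[Step]) -> List[str]:
    """Run steps in order and return warnings for tolerated failures.

    A step is tried up to max_attempts times. If it still fails, the
    deployment stops unless the step is marked continue_on_failure.
    """
    warnings: List[str] = []
    for step in steps:
        for attempt in range(1, step.max_attempts + 1):
            try:
                step.action()
                break
            except Exception as exc:
                if attempt < step.max_attempts:
                    continue  # try again
                if not step.continue_on_failure:
                    raise  # a required step failed: fail the deployment
                warnings.append(f"{step.name}: {exc}")
    return warnings
```

A notification step would then be declared with something like `Step("Notify Slack", notify, continue_on_failure=True, max_attempts=3)`, while the actual deployment steps keep the defaults and still fail the deployment when they fail.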
I would love to have this feature. We use some steps to post messages to Slack during deployments. However, when Slack is down for some reason, the entire deploy fails because it couldn't talk to Slack.
We really need this feature. We always do deployments at night, often to more than 300 computers at a time.
With this many computers, there is always at least one that fails for whatever reason.
This would be acceptable for us if it only affected that machine.
As it stands, if one step fails on one computer, none of the other computers are updated, which is of course unacceptable.
Yes please. Also for HipChat steps.
Matthew Morris commented
I would second having the option to mark a step as "optional". Like Gavin and Chris, I have had a few annoyances with HipChat today due to timeouts, and whilst the deployment notifications are useful, I don't think they are worth failing the build.
Gavin Burke commented
As per Chris Maffin's post, my problem is also HipChat notification. Randomly the API doesn't answer, then the deployment fails and the last few TeamCity steps fail because of this too. Marking a step as optional would allow all these deployments to continue whatever the outcome.
Chris Maffin commented
I'd like to add to this that I'd like to be able to set a step as "optional", in that it won't fail the deployment if it fails (e.g. a HipChat success notification: if for whatever reason the HipChat server is unavailable, I want the deploy as a whole to not fail).