AYON's failover mechanism during server outage

Hullo,

I’m making this post to ask about maintaining a self-hosted AYON Server.

AYON’s client-server architecture is very beneficial for propagating information from a single source of truth.

However, this makes stabilizing the server and ensuring continuous uptime critically important. When the server is down, nobody can work. This is a big risk, especially during production, since studios are usually on tight deadlines.

Despite our best efforts, unforeseen outages can occur, which puts significant stress on the IT department, as they are responsible for maintaining operations. In this situation, the AYON server becomes a single point of failure that can potentially halt production.

As I’m relatively new to AYON, I would greatly appreciate any guidance on the following:

1. What failover mechanisms can be used to maintain AYON backend availability during a service outage?

Our initial thought on this question is to deploy a PostgreSQL cluster to protect the DB, so that if the main AYON server goes down, we can quickly switch to a backup server that connects to the same cluster. However, I’m not entirely sure if this would cover all the components necessary for normal operation.
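
To make the switch-over idea concrete, here is a minimal sketch of how a client-side script or a small watchdog could choose between the main and the backup server. It only illustrates the "probe, then fall back" logic and uses the Python standard library; the server URLs are placeholders, and `http_probe` and `pick_active_server` are hypothetical helper names, not part of AYON.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, HTTP error, or malformed URL.
        return False


def pick_active_server(
    candidates: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first candidate that passes the probe, or None if all fail."""
    for url in candidates:
        if probe(url):
            return url
    return None


# Placeholder addresses (assumptions): main server first, backup second.
SERVERS = ["http://ayon-main.example:5000", "http://ayon-backup.example:5000"]
```

Calling `pick_active_server(SERVERS)` would return the main URL while it responds and the backup URL otherwise. A reverse proxy or load balancer with health checks in front of both servers would achieve the same thing without touching the clients.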

2. Can the AYON client have a local cache and an offline mode (with limited functionality, such as launching a DCC and incremental saves) for when the AYON backend is unavailable?

Implementing an offline mode is a challenge, as each DCC interacts with AYON in a different way. However, I believe it would be extremely useful if AYON had a mechanism that allows users to perform essential tasks such as launching a DCC. This would enable people to continue working in a limited capacity while the tech team fixes the issue with the server.
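
As a rough illustration of what such a cache could look like, the sketch below keeps the last successful server response on disk and falls back to it when the server cannot be reached. This is not how AYON works today as far as I know; `load_with_cache`, the `fetch` callable, and the cache file location are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def load_with_cache(fetch: Callable[[], dict], cache_file: Path) -> Optional[dict]:
    """Fetch data from the server, falling back to the last cached copy.

    `fetch` stands in for whatever call retrieves launch settings from the
    backend; it is expected to raise OSError (for example ConnectionError)
    when the server is unreachable.
    """
    try:
        data = fetch()
    except OSError:
        # Server unreachable: use the cached copy if one exists.
        if cache_file.exists():
            return json.loads(cache_file.read_text(encoding="utf-8"))
        return None

    # Server reachable: refresh the cache for the next outage.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    return data
```

The returned data may be stale during an outage, so an offline mode built this way would need to be limited to operations that are safe to run on old settings, and anything written locally would still have to be reconciled once the server is back.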

I’d really appreciate any insights or experiences the community can share regarding these challenges. Your feedback will be incredibly helpful as we work toward improving AYON’s resilience and ensuring smooth operations during production.

Thank you in advance for your input!

Hey,
Could you please make it more explicit at the top of your page that it’s related to a self-hosted AYON Server?

Also, I believe this topic is very similar to
