Clustering Cinc or Chef, Part 1
This article is the first in a series about setting up Cinc or Chef as a horizontally scalable service. This first post introduces which parts of the service need to be broken out in order to form a cluster.
Cinc is surprisingly simple to set up, especially considering
the amount of underlying complexity that is hidden from the user.
# curl -L https://omnitruck.cinc.sh/install.sh | sudo bash -s -- -P cinc-server -v 14
# cinc-server-ctl reconfigure
Sure enough, we have a brand new Cinc server, ready to manage
hundreds, or even thousands, of clients. It almost feels too easy, doesn’t
it?
Let’s take a look at what we have.
From our perspective, it’s just a Cinc server, taking and
serving requests. Under the blankets, though, there’s a bit more to it.
Things, at least at first, are just fine!
As time passes, it becomes clearer that our entire infrastructure is
dependent upon that server never, ever going down. Something as simple as a
kernel upgrade on your Cinc server often leaves the entire
organization rudderless. Your organization loses the ability to scale
new app servers, because Cinc is down. SecOps loses
the ability to patch zero days, because the Cinc server is down.
Tools that rely upon inventory management lose the ability to watch systems.
It can get ugly. It’s our responsibility to avoid single points of failure like
these.
Thankfully, there is a way to run as many Cinc servers
as we want! The API server itself is stateless, so we can run any number of them
as long as we externalize the PostgreSQL database and the OpenSearch
cluster. The rest of the stuff (nginx, Redis, RabbitMQ) can stay right
where it is.
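Because the API servers hold no state of their own, each one can be checked (and replaced) independently of the others. As a minimal sketch of that idea, the script below polls the `/_status` endpoint on each API node and reports which ones are healthy. The hostnames are hypothetical, and the response shape (a top-level `status` of `pong` plus an `upstreams` map) is an assumption about what the server returns; check it against your own installation before relying on it.

```python
import json
import ssl
import urllib.request

# Hypothetical hostnames; substitute your own API nodes.
API_NODES = ["cinc-api-1.example.com", "cinc-api-2.example.com"]


def is_healthy(payload) -> bool:
    """Return True when the node and every upstream it reports answer 'pong'.

    Assumes a body shaped like:
    {"status": "pong", "upstreams": {"chef_sql": "pong", ...}}
    """
    if not isinstance(payload, dict):
        return False
    if payload.get("status") != "pong":
        return False
    upstreams = payload.get("upstreams", {})
    if not isinstance(upstreams, dict):
        return False
    return all(state == "pong" for state in upstreams.values())


def check_node(host: str, timeout: float = 5.0) -> bool:
    """Fetch https://<host>/_status; any network, HTTP, or parse error counts as unhealthy."""
    url = f"https://{host}/_status"
    context = ssl.create_default_context()
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return is_healthy(json.load(resp))
    except (OSError, ValueError):
        return False


if __name__ == "__main__":
    for node in API_NODES:
        print(f"{node}: {'ok' if check_node(node) else 'DOWN'}")
```

Note that this sketch verifies TLS certificates, so a node still using a self-signed certificate will be reported as down until a trusted certificate (or your own CA bundle) is in place.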
Things are slightly more complicated than the above chart, but not by very much. First, redundancy for the database server must be addressed, either by using Amazon RDS in Multi-AZ mode (which will handle all failover for you), or by manually setting up some sort of replication and failover process for the database server.
Setting up redundancy for OpenSearch is less important, as the data can be regenerated at any time by running “cinc-server-ctl reindex” on any of the Cinc API servers. That said, setting up replication for OpenSearch is rather easy and should be done if you have the money to cover the cost of three systems.
RDS and Amazon OpenSearch can handle that redundancy for you, but if you want to build a fully redundant architecture build-it-my-own-self style, you’re looking at the following:
In our next article, we will cover the basic configuration changes needed to externalize the database and search engine.