The most overkill, insane, and over-the-top server configuration that would not look out of place in a large company's datacenter. The stuff that dreams are made of. And I'm here to show you all how to do it.

Here's the general overview of the architecture of our system: we are going to split it into multiple layers to keep it as manageable as possible. There will be three main layers: the Hardware layer, where the actual hardware appliances will be running; the Infrastructure layer, where we manage the hardware, networking, and storage; and the Platform layer, where we manage applications and pipelines.

Let's start with the "hardware" layer. This will include the "Undercloud" hardware (more on that later), the networking gear (firewall, routers, and switches), and the "Overcloud" hardware (also more on that later).

Hardware

In order for an enterprise-grade homelab to work, we need some hardware. There will be four main types of machines: undercloud nodes, controller nodes, compute nodes, and storage nodes. If you wish, you can add a dedicated networking node (or three). Let's start with server specs:

Undercloud Node

  • Doesn't need to be super powerful, but needs to be redundant
  • Backbone of the infrastructure
  • Needs a decent amount of storage (for server images)
  • Fast SSD preferred (PXE)
  • Hardware TBD
  • IPMI port and multiple networking ports required

Controller Node

  • Doesn't need to be super powerful, but needs to be redundant
  • If using FWaaS, no Networking node is required
  • Fast SSD required
  • Hardware TBD

Networking Node

  • Dedicated networking nodes are better, but more expensive. Networking can also be run on the controllers / undercloud
  • Best if you're not using FWaaS
  • Lower core-count, high clock speeds
  • Hardware TBD

Compute Node

  • VMs are gonna be running here: high RAM, high core-count machines.
  • Large amounts of SSD required (for VM images)
  • Dell R720 (cheap, slow(er))
  • Dell R730 (sub 1k, pretty fast)
  • Dell R740 (latest gen, expensive, fast)

Storage Node

  • Dedicated mass-storage nodes (100TB+)
  • High core-count, low RAM, many PCIe lanes
  • Running services: Cinder (block storage), Glance (image storage), Swift (object storage), Manila (filesystem storage), and Ceph (general, tentative); see the sketch after this list
  • SSD cache?
  • Hardware TBD
  • ZFS? Or RAID?
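
To make that service list a bit more concrete, here's a minimal sketch of how those storage services get consumed once everything is up, using the openstacksdk Python client. The cloud name "homelab", the image filename, and the sizes are placeholders I'm making up for illustration; it assumes a working clouds.yaml entry with credentials.

```python
import openstack

# Assumes a clouds.yaml entry named "homelab" (placeholder name) with valid credentials.
conn = openstack.connect(cloud="homelab")

# Cinder: carve a block-storage volume out of the storage nodes (size in GB).
volume = conn.block_storage.create_volume(name="scratch", size=100)

# Glance: register a VM image that the compute nodes can boot from.
image = conn.image.create_image(
    name="ubuntu-22.04",
    filename="ubuntu-22.04.qcow2",   # hypothetical local file
    disk_format="qcow2",
    container_format="bare",
)

# Swift: an object-storage container for backups and build artifacts.
conn.object_store.create_container(name="backups")
```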

All of these machines need a few network interfaces to connect with each other, and choosing the right set of hardware is crucial.

Networking

In order for the networking to work, we will need some hardware. Namely, we will need a VLAN-capable switch and some network cards.

First, we have to pick a target speed for the network to run at.

  • 1GbE (cheap, slow, don't recommend)
  • 10GbE (cheap-ish, fast)
  • 40GbE (not cheap, faster)
  • More (def not cheap, fastest)

Once you've made your decision, you have to pick the hardware interface that the networking will use (copper, fiber, or Infiniband).

Infiniband

  • Made by Mellanox (now Nvidia Networking)
  • Secondhand is inexpensive
  • Up to 100GbE (future-proof)

Fibre

  • Ports: small form-factor pluggable (SFP) transceivers
  • Up to 200GbE (more with special hardware)
  • VERY expensive, delicate cabling

Copper

  • “Traditional” Ethernet (RJ-45)
  • Up to 40GbE (after 10GbE, gets very expensive)

Personally, I'd choose Infiniband, because the hardware is very inexpensive second-hand and very fast.

You'll then need:

  • Two network cards / node (dedicated network nodes can have more)
  • Four VLAN-capable failover switches
  • A lot of cables

The architecture of the network is important. We are going to run two networks: the internal "cloud" network, and the public network, which is going to expose our machines to the internet. Every machine should be connected to both networks. A dedicated router/firewall is recommended.
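
As a rough sketch of what that two-network layout looks like from OpenStack's point of view, here's how the tenant-side half of it could be wired up with openstacksdk once Neutron is running. The names ("homelab", "internal", "public", "edge-router") and the CIDR are assumptions for illustration; the public/provider network itself is normally created by the admin as part of the deployment.

```python
import openstack

conn = openstack.connect(cloud="homelab")  # placeholder clouds.yaml entry

# The internal "cloud" network that the VMs will actually live on.
internal = conn.network.create_network(name="internal")
subnet = conn.network.create_subnet(
    network_id=internal.id,
    name="internal-subnet",
    ip_version=4,
    cidr="10.0.0.0/24",  # made-up range
)

# A router that hangs the internal network off the pre-existing public network,
# so instances can reach (and be reached from) the internet via floating IPs.
public = conn.network.find_network("public")
router = conn.network.create_router(
    name="edge-router",
    external_gateway_info={"network_id": public.id},
)
conn.network.add_interface_to_router(router, subnet_id=subnet.id)
```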

Firewalls

The choice of firewall is entirely up to you; pick whatever you're comfortable running.

Software

There are two layers to the software that we are going to run: the Infrastructure layer, where we will manage the actual hardware, networking, and virtual machines; and on top of that, the Platform layer, where we deploy programs and apps to be served to our clients.

For our IaaS (Infrastructure as a Service), we will use OpenStack. OpenStack is "The Most Widely Deployed Open Source Cloud Software in the World", with users like Walmart, PayPal, Blizzard, Deutsche Telekom, Baidu, CERN, T-Mobile, Target, Progressive, Yahoo!, Overstock, DirecTV, Tencent, Verizon, GAP, Nike, and a whole lot more. This will form the basis of our personal cloud. Next, we have to determine how we are going to run apps on top of OpenStack. For our PaaS (Platform as a Service), we will use OpenShift. This is essentially a fancy Kubernetes (container cloud platform) that is better suited for our use, as it integrates with OpenStack better. For our edge router, we are planning to use HAProxy, as it is a mature piece of software with a lot of support behind it.
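
Since an HAProxy deployment boils down to a plain-text config file, here's a tiny Python sketch of the kind of edge-router config I have in mind: one HTTP frontend that load-balances across the OKD ingress routers. The IPs and names are invented for illustration, not pulled from a real deployment.

```python
# Render a minimal HAProxy edge config that forwards HTTP traffic to the
# OKD ingress routers. The router IPs below are placeholders.
OKD_ROUTERS = ["192.0.2.10", "192.0.2.11"]

config = """\
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend edge_http
    bind *:80
    default_backend okd_ingress

backend okd_ingress
    balance roundrobin
"""

for i, ip in enumerate(OKD_ROUTERS):
    config += f"    server router{i} {ip}:80 check\n"

with open("haproxy.cfg", "w") as f:
    f.write(config)
```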

Software Architecture (TL;DR):

  • TripleO (OpenStack On OpenStack) architecture: https://docs.openstack.org/tripleo-docs/latest/
  • Deploy OpenShift (OKD) on OpenStack: https://docs.openshift.com/container-platform/4.5/installing/installing_openstack/installing-openstack-installer-custom.html#installing-openstack-installer-custom
  • Applications (e.g. website) are deployed on OKD
  • Infra (e.g. Docker, OKD) is deployed on OpenStack (see the sanity-check sketch below)
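
Between those two steps, it's worth sanity-checking that the TripleO-deployed overcloud is actually answering before you point the OKD installer at it. Here's a quick sketch of that check with openstacksdk; the "overcloud" clouds.yaml entry is a name I'm choosing for illustration (TripleO hands you credentials, but you have to put them somewhere yourself).

```python
import openstack

# Assumes an "overcloud" entry in clouds.yaml (name chosen for illustration).
overcloud = openstack.connect(cloud="overcloud")

# Nova: make sure the compute nodes registered as hypervisors.
for hv in overcloud.compute.hypervisors():
    print(hv.name, hv.state)

# Glance: make sure the image service responds before the OKD installer needs it.
for img in overcloud.image.images():
    print(img.name, img.status)
```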

Until the next post, I recommend you familiarize yourselves with OpenStack and OpenShift, as the rest of this is going to be very heavy on both of those.
