Understand the Use Case
If you’re considering building a lab, chances are you’ve got a good idea of what you want to do with it. If not, then let me give you one piece of advice: THINK CAREFULLY BEFOREHAND ABOUT WHAT YOU WANT TO ACHIEVE. A 10-GPU “supercomputer” can be great fun but is woefully unnecessary if you want a web development server. Likewise, a petabyte of redundant storage truly lives up to its name if you’re planning to use it to host a Doom multiplayer server. I know it sounds obvious, but it’s just too easy to get distracted by a great eBay deal and end up with a 900-watt paperweight.
I can already hear you shouting, “EBAY HO, BUY ALL THE THINGS”, but let’s just slow down there for one second, soldier. This is a really, really good way to spend a whole lot of money on a stack of stuff that is worth more to a scrapyard than to your homelab. There’s a ton of enterprise gear out there ready for the taking, but do your due diligence or you won’t get the good stuff. Instead, you’ll end up with a server with only one Ethernet management port and no network card. It’s a bad scene when that happens, so let’s try to preclude it.
Common Use Cases
Most homelab users get started with a very specific use case in mind. The starter use cases are usually:
- Game Servers
- Media Servers
- Storage and Archiving
- Web Hosting
- Certification Study
- Remote Access
- Development Servers
- Home Automation
- Cryptocurrency and Other Electronic Waste
It is important to understand that your homelab project is not the first of its kind. There are countless self-hosted game servers, media servers, and websites out there. Before breaking out the credit card, do at least a Google search for a similar project that describes its setup, so you understand how the underlying hardware is being used. Understanding hardware is key to determining what you will need instead of what you want or think you may need.
For example, a home media server doesn’t need a 24-port 10G switch. It also doesn’t need an AMD Ryzen Threadripper 3990X (64 cores, 128 threads, 2.9 GHz). Both of those items add expense and power usage without providing a better media server. What matters most is overall disk space and making sure the disks are redundant (note that disk speed is not of the utmost importance). With this in mind, you can search for a single-server setup that can handle an appropriate amount of disk space for your needs.
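When sizing that redundant disk space, the arithmetic is worth writing down before shopping. A minimal sketch (the disk counts and sizes here are just illustrative) for a RAID 5 / RAID-Z1 layout, where one disk’s worth of space goes to parity:

```shell
# Usable capacity for RAID 5 / RAID-Z1: one disk's worth of capacity is
# consumed by parity, so usable = (disks - 1) * disk size.
usable_raid5_tb() {
  n_disks=$1
  disk_tb=$2
  echo $(( (n_disks - 1) * disk_tb ))
}

usable_raid5_tb 4 8   # four 8 TB disks -> 24 TB usable
```

Run the numbers against your media library (plus growth) before deciding how many drive bays you actually need.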
Larger Use Cases
Larger use cases (or enterprise use cases) will combine multiple use cases from above along with new use cases, monitoring, multiple environments, additional access & control mechanisms, and more. An example of a larger use case can be an edge computing solution for home automation combined with a home security system. Another use case could be a web crawler, archiver, and data processor to create a custom search engine. A personal use case of my home lab is snapshotting/archiving documentation of older products and software so that if said documentation were no longer hosted by a third party, I’d still have access to it.
When diving into larger/enterprise use cases, you often run into a different level of concerns. This particular blog post won’t dive into such details; just understand that additional costs come with enterprise setups: backup power, layers of redundancy, DR and backup strategies, mitigating natural and man-made disasters, and more. Each of these will increase cost and slowly transform a home lab into a datacenter.
Find a Deal
Keep in mind that this post is meant for someone who’s already started labbing, but wants to up their gear to do more and doesn’t know where to begin.
The vast majority of us started with an old PC, leftover parts from a previous upgrade, or maybe the box your parents no longer needed after they got a new machine. You volunteered to take it off their hands, cleaned it up, and put what they left behind to use. Personally, this is exactly how I got my original NAS.
I’d be very surprised if you’re not already sitting on a pile of old parts in some way, shape, or form; if you weren’t the kind to collect parts, you probably wouldn’t be labbing. Even if all you have is one PC, use it. These days we have VirtualBox, which does a fine job of running just about everything you might want to try out. It might be a bit slow, but you can get started while you wait for your tax return/birthday money/lottery winnings to arrive.
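To make that concrete, here is roughly what spinning up a first VM from the command line looks like with a reasonably recent VirtualBox; the VM name, OS type, and sizes are placeholders, and the same thing can be done entirely from the GUI:

```shell
# Create and register a VM, give it RAM/CPU, and attach a fresh disk.
VBoxManage createvm --name "labvm" --ostype Ubuntu_64 --register
VBoxManage modifyvm "labvm" --memory 2048 --cpus 2 --nic1 nat

# 20 GB virtual disk, attached via a SATA controller.
VBoxManage createmedium disk --filename labvm.vdi --size 20000
VBoxManage storagectl "labvm" --name "SATA" --add sata
VBoxManage storageattach "labvm" --storagectl "SATA" \
  --port 0 --device 0 --type hdd --medium labvm.vdi

# Attach an installer ISO and boot it from here, or open the GUI.
```

Everything the GUI does is scriptable this way, which is itself a useful lab skill.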
The key point is that nothing about learning the basics of homelab setup requires enterprise hardware, except, of course, learning how enterprise hardware itself is laid out. That has its merits, but most of it can still be learned by building your own PC. Coding, Linux, FreeBSD, Windows Server 2012 R2, containers, hypervisors, networking, storage: all of it can be done on a fairly recent laptop or desktop.
Let’s discuss the different types of hardware you’ll encounter; hopefully, this will save you from climbing an expensive learning curve. You don’t want to end up with a server so old it doesn’t support virtualization. Older servers can be power hungry and scream every time you turn them on. These tips should help you tell the difference between a $150 paperweight and a $200 deal.
This issue is important to me because I have watched those uninitiated in homelab quickly lose their enthusiasm after ending up with Pentium 4-era Xeons that are practically worthless. I point this out not to pound beginners into the ground, but because if you don’t research, ask around, and make sure of what you’re getting, you could end up with worthless hardware without even knowing it. And, trust me, it’s not always easy to see when you might be headed down this path. I speak from experience.
There are a number of guides on the internet to help with buying used/refurbished/old servers, and your search engine of choice will lead you on many adventures. It cannot be stressed enough that you should understand your use case before you purchase a machine. Here is a list of questions to ask yourself:
- What kind of connections does the motherboard provide for hard drives?
- Does the server have a RAID card?
- If the RAID card fails, how hard will it be to replace?
- If a drive fails, how hard will it be to rebuild the RAID array?
- What is the maximum memory supported by the RAID card?
- Is this server primarily reading or writing data?
- Is read or write performance a central focus of this server?
- What level of redundancy is needed for this data?
- Can this server use a NAS instead of local hard drives for the non-OS (or all) data?
- Will this server need to “trust” the hard drives attached to it? (A server may not be able to read the temperature of a third-party hard drive and will assume it is overheating. The fans then go full blast, driving up the machine’s energy consumption and noise. This is a problem in servers like Dells, where a Dell-certified hard drive is expected.)
- What are the network throughput needs of this project?
- Is the network card fast enough for this project’s needs? Is the switch/router it is connected to fast enough for this project’s needs?
- Does the card provide enough ports for the considered management setup?
- Does it provide redundancy at the card or port level?
- If the network card fails, how hard will it be to replace?
- What are the memory needs for the project and what are the memory options provided by the motherboard?
- Not a question, but a note – use ECC RAM. Servers are not personal-use computers; with multiple workloads running on them, ECC RAM can prevent a systemic crash that destroys every workload on the server.
- Another note – don’t use DDR2 memory. It’s a power hog and getting harder and harder to replace.
- Does the motherboard accept UDIMM, RDIMM, or LRDIMM, and in what configurations?
- What RAM is currently available from other projects to reuse?
- Are any processes or workloads memory intensive or is RAM general use?
- What level of compute power is needed?
- Does the motherboard for this project support the expected CPU?
- Does the CPU support the RAM for this server?
- Does the CPU support virtual machine passthrough (Intel VT-d or AMD-Vi)?
- Are vendors readily stocking this CPU?
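On the virtualization question in particular, you can check a machine before committing workloads to it. This sketch reads the CPU flags Linux exposes (Intel VT-x shows up as `vmx`, AMD-V as `svm`); note that VT-d / AMD-Vi passthrough support usually has to be confirmed against the CPU spec sheet and the motherboard BIOS, not just these flags:

```shell
# Report whether cpuinfo-style text on stdin advertises hardware
# virtualization (Intel VT-x = vmx, AMD-V = svm).
check_virt_flags() {
  if grep -qE '(^| )(vmx|svm)( |$)'; then
    echo "hardware virtualization flags present"
  else
    echo "no virtualization flags found"
  fi
}

# On a live Linux box:
#   check_virt_flags < /proc/cpuinfo
```

If the seller can give you a `/proc/cpuinfo` dump, you can answer this question before the hardware ever ships.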
Places to Purchase
The primary place to find “deals” on retired server equipment is eBay. eBay serves as a single point where recyclers, resellers, and refurbishers can sell IT equipment. In fact, most shops run multiple eBay storefronts, and some also operate a branded web store of their own. eBay is the place I personally go first when I am bored and want to look at stuff I will never buy.
There are a number of other places to check that are not eBay. (Note that this section is written from an American perspective; if you are shopping elsewhere, this guide may not be perfectly applicable to buying in your region.)
Local Electronic Recyclers
Electronic recyclers are sometimes tasked with cleaning out old datacenters. This leaves the recycler with enterprise servers and networking equipment that need to be sold, and some items are best picked up in person. Renting moving equipment and hauling server racks home yourself can save thousands on such a purchase (from personal experience). I have built/purchased both my mobile testing platform and my server racks from a local electronics recycler. It’s as simple as setting up an appointment with the recycler and taking a tour of their warehouse. You may well find more for sale in there than just the equipment for the project you are planning.
The major benefit of visiting an electronic recycler is that they may be willing to make a deal NOW. You are there, you have money, and they do not need to ship the product; that reduces their costs, and they can pass the savings on to you. However, make sure you can move and transport the items you bought. Server racks can weigh upwards of 400 lbs and may not fit standing up in a standard rental box truck. Make sure whatever you buy will fit not only in the room you purchased it for, but also through the doorways leading to that room.
Surplus Stores
A surplus store sells items that are used, or purchased but unused and no longer needed; some also sell items that are past their use-by date. Additionally, there are government auctions for similar property where some amazing deals can be found. Note that these amazing deals are sought after by many personal and professional hobbyists, so don’t expect too much of an amazing deal.
Online Stores
Most of what can be said about online sales in this article has already been stated; anything else you likely know from your own online shopping experience. For the sake of being somewhat useful, here is a list sourced from the reddit homelab wiki buying guide:
- http://ebay.com – the best place to get used servers
- https://www.homelabtech.com – Refurbished Servers, Storage, Networking, and Parts
- https://www.orangecomputers.com – Refurbished Servers, Storage, Networking, and Parts
- https://www.metservers.com/ – Refurbished Servers, Desktop, Networking, and Parts
- http://www.pennelcomonline.com/ – Racks, rack accessories, or modular parts (rack strips, corners, cross-members for custom racks etc)
- https://www.ispsupplies.com – WaveGuard WG-UB-RM1 Rackmount kit for Ubiquiti EdgeRouter Lite and similar models.
- https://www.startech.com/ – Racks, rack accessories, and hard-to-find or otherwise niche tools.
- http://www.navepoint.com/ – Racks, rack accessories, and general office equipment
Using the Cloud
The cloud can be utilized to keep costs down. You read that last sentence correctly: it can be used to keep costs down. From a business perspective, it shifts capital expenses to operational expenses. For a home lab, it means $10,000 in equipment cost can instead be spread out month to month over the course of years. As a Microsoft MVP for Azure, I have a good sense of when to use the public cloud versus when to invest in the private cloud. Hopefully, this section can provide a quick guide to when and where your project can benefit from either.
One thought worth sharing: the entire integration with the public cloud can be dynamic, if you so choose. From the VPN components to the different offerings being consumed, everything can be created on demand (unless there is a need for persistent state). That said, certain items require physical components and long-term contracts; if your project requires those, it may fall outside the definition of “home lab” being used here. Also, some items, like a VPN Gateway in Azure, may take a half hour to an hour to provision on demand. For a home lab, some pre-planning may be required around those time constraints (in an enterprise environment, all of those items would be persistent).
For a home lab, the primary purpose is to own and house the equipment running your projects. That does not mean there are no benefits to using the public cloud in a hybrid scenario. The following are a few scenarios where the public cloud could help reduce costs:
Of the project options listed above, some could benefit from being able to scale out on demand. Web servers, game servers, development servers, and more may have inconsistent demand. If your project involves an always-online game server and suddenly one thousand of your closest friends plan to play together one night, you may need to scale out beyond the capacity of your home lab.
Assuming the project is set up for this scenario, hosting it in the public cloud may be as simple as changing a public DNS entry and uploading your virtualization configuration to a public cloud provider. An example would be a Minecraft server running inside a container. It can be quickly uploaded to something like Azure Container Instances for the evening at a cost of a fistful of dollars. Compared to the thousands in hardware costs that would be needed for that one evening, the public cloud can provide the required infrastructure for a fraction of the cost.
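As a rough sketch of what that burst-out might look like with the Azure CLI: the resource group, names, and sizing below are hypothetical, and `itzg/minecraft-server` is a popular community container image rather than an official one.

```shell
# Spin up a temporary Minecraft server in Azure Container Instances
# for the evening. All names and sizes here are placeholders.
az container create \
  --resource-group lab-burst \
  --name minecraft-night \
  --image itzg/minecraft-server \
  --cpu 2 --memory 4 \
  --ports 25565 \
  --dns-name-label my-lab-minecraft \
  --environment-variables EULA=TRUE

# Point your public DNS entry at the resulting
# my-lab-minecraft.<region>.azurecontainer.io hostname, then tear it
# down when everyone logs off so the billing stops:
az container delete --resource-group lab-burst --name minecraft-night --yes
```

The delete at the end is the important part; on-demand pricing only stays cheap if the resource actually goes away.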
Another example could be a public web server for a one-day conference. For roughly 360 days of the year, the site will receive one or two hits a day. When the week of the conference arrives, it may suddenly get hundreds or thousands of hits per day. Instead of running the site in the cloud the entire year, or buying enough hardware to host the conference-week load in your lab, use the cloud for the week of the conference and the home lab (private cloud) for the rest of the year.
Some of the key tenets of a proper disaster recovery protocol are a secondary location, offsite storage, or some other physical separation of the recovery environment. This hedges against a physical disaster in the private cloud region. Multi-locality is a primary tenet of any cloud hosting, but in the home lab scenario it is mostly not feasible. A public cloud offering can be a cheap disaster recovery option for the home lab. Encrypted backups of configuration and servers, paired with separately located media backups of sensitive data, can combine to form a DR strategy for the home lab.
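A minimal sketch of the “encrypted backups of configuration” half, assuming `openssl` (1.1.1 or newer, for `-pbkdf2`) is available; the source directory, passphrase file, and offsite destination are all placeholders:

```shell
# Pack a directory and encrypt it before it ever leaves the lab.
# Usage: encrypt_backup <source-dir> <output-file> <passphrase-file>
encrypt_backup() {
  tar -czf - -C "$1" . |
    openssl enc -aes-256-cbc -pbkdf2 -salt -pass "file:$3" -out "$2"
}

# Restore later with:
#   openssl enc -d -aes-256-cbc -pbkdf2 -pass "file:<passphrase-file>" \
#     -in backup.enc | tar -xzf -
#
# The offsite transfer is up to you (a friend's NAS, a cloud blob
# store, etc.) -- only the encrypted file should ever leave the lab.
```

Keep the passphrase file somewhere physically separate from the backups, or the encryption buys you nothing in a real disaster.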
Not everything will be cheap to host in a public cloud for DR purposes. If the project is a media server or data lab, the storage fees on media not hosted within the lab may prove too costly. The DR scenario where the cloud can help, on a home lab budget, is one where the underlying data is small enough to keep the running costs low.
The recovery strategy used in my personal lab starts with core infrastructure. First, the physical hosts are configured for virtualization and then a mix of VMs and containers are deployed to start:
- Certificate Authority (with root certs being loaded from backup physical media)
- apt-mirror & docker-registry
- Data Systems
- Web Servers
By using cloud offerings for some core infrastructure, both costs and restart time are minimized. For each of the following, an option could be:
- CA – Let’s Encrypt
- apt-mirror & docker-registry – use public free apt repos and docker hub
- DNS – use the name servers provided by the registrar
- LDAP – use Azure Active Directory where it can replace LDAP (don’t write me an essay about how AAD is not LDAP! I know it’s not, but for something like an internal website or GitLab it can make a suitable replacement.)
Some services in the public cloud can easily outscale a home-lab-configured version at a lower cost (even over the long run). Also, some public cloud offerings can make specific portions of the home lab project much faster to complete. If you are building a machine learning setup, utilizing the dynamic compute capabilities of something like Azure Machine Learning to host notebooks or add compute power for the models will have a drastically lower price point than configuring the same in a home lab.
Another example could be adding a text messaging feature for two-factor authentication. Adding a Twilio messaging account will cost much less than trying to add an entire phone system. Similarly, using Office 365 or Zoho Mail could be cheaper than any self-hosted alternative. Moreover, the free tiers offered by GitHub are so fully featured now that self-hosting is purely for the hobby and not for any feature benefit.
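To give a sense of how little glue that 2FA text takes, here is a sketch using the official Twilio Python helper library (`pip install twilio`). The environment variable names and phone numbers are placeholders, and generating/verifying the one-time code is left to your auth stack:

```python
# Sketch: send a one-time 2FA code via SMS through Twilio's REST API.
# Credentials and the sending number are placeholders read from the
# environment; this will not run without a real Twilio account.
import os
import secrets

from twilio.rest import Client

def send_2fa_code(to_number: str) -> str:
    """Text a 6-digit one-time code to `to_number` and return it."""
    code = f"{secrets.randbelow(1_000_000):06d}"
    client = Client(os.environ["TWILIO_ACCOUNT_SID"],
                    os.environ["TWILIO_AUTH_TOKEN"])
    client.messages.create(
        body=f"Your lab login code is {code}",
        from_=os.environ["TWILIO_FROM_NUMBER"],  # your Twilio number
        to=to_number,
    )
    return code  # store a hash server-side and compare at login
```

A few dollars a month of messaging fees versus hardware, a phone line, and a modem to babysit: this is the kind of spot where the cloud wins on every axis.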