A break from the usual electronics project posts: it's time to discuss some server management.
I’ve always had a server. Even in 1986 I had a server. We called it a “BBS” back then, but it was still a server. Throughout the years, the server has evolved, from Lantastic ARCNET to 10base2 coaxial Ethernet, to Gigabit Ethernet. It’s gone from 20 megabyte hard drives to a RAID array of 4 Terabyte hard drives. The server is not just a place to store files, but it’s also a place to run compute jobs. If you want to host a music or video library, where do you do that? The server. If you want a long-running VM for work, where does that happen? The server. This post discusses the latest evolution of the server, disaggregating it into a Synology NAS together with Intel NUCs.
Goliath (The Old Server)
I built Goliath back in 2012. At work we were using a lot of Dell R410 servers, which seemed like beefy, well-provisioned machines, so I attempted to create something similar at home from commodity parts purchased from Newegg. I didn't get the latest and greatest CPU at the time, which would have been cost prohibitive, but chose something a little less ambitious that didn't break the bank. This is what Goliath consisted of:
- An Antec 4U rackmount chassis, because rackmount gear is cool (and durable and easy to maintain).
- A SuperMicro X8DTi-F-O motherboard. This motherboard supported dual Xeon CPUs and IPMI.
- Two E5645 Xeon 6-core CPUs
- 36 GB (9 x 4 GB) of ECC memory. Yes, 36 GB, not 32 GB. This was Intel's weird chipset that used triple-channel memory.
- A 3WARE 9650SE-4LPML RAID card with Battery Backup. This was a really nice card, and the battery backup ensures that even if you pull power from it while it’s running, it doesn’t have to rebuild the array.
- Four WD RED 4TB NAS hard drives, configured as two RAID1 arrays. Back in 2012 I probably started out with 1TB or even 500GB disks, but 4TB drives are what's in there now.
Goliath ran CentOS, exporting files via NFS and Samba, and also ran VMWare Workstation to host VMs. Most of my virtual machine work is over SSH, but I would connect to Workstation over a VNC session to actually launch the machines. Using Workstation, a desktop product, rather than a bare-metal hypervisor or other server virtualization product had the advantage that Workstation has a pretty good GUI and supports features like easy snapshotting. Snapshotting is really nice when you're doing development on VMs.
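As an aside, Workstation's snapshots can also be driven from the command line with its vmrun tool. Here's a minimal sketch, assuming a VM at a hypothetical path:

vmrun -T ws snapshot /vmstore/devbox/devbox.vmx "pre-upgrade"            # take a snapshot of the VM
vmrun -T ws revertToSnapshot /vmstore/devbox/devbox.vmx "pre-upgrade"    # roll back to it later

That makes it easy to checkpoint a development VM before risky changes without even opening the GUI.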
The problem with Goliath is that walking past it is like walking past an electric space heater. The fancy SuperMicro motherboard and its dual Xeon CPUs give off a lot of heat. All that heat costs money in electricity, and then the heat needs to be air conditioned back out of the room, costing even more money. By today's standards, for all that energy consumed, it's not even particularly fast. Yes, 12 cores can do a lot of work in parallel, but many tasks are single-threaded in nature, and the E5645 just does not have good single-core performance by today's standards. On top of that, all that wasted energy is bad for the environment.
Goliath has got to go, and be replaced by something much more efficient.
Disaggregation
I never really liked having The Server responsible both for holding files and for running compute tasks (VMs). Even though VMWare Workstation provides excellent isolation, there were times when an errant VM would go full throttle on resource consumption and cause the CPU fans to suddenly spin up to high speed. I could occasionally see a cloud of dust rise from the rack when this happened! VMWare Workstation would sometimes become sluggish and even nonresponsive, unable to delete a misbehaving VM. More than once a VM became so un-shutdown-able that I had to reboot the machine to get rid of it. Excitement like that isn't good for a file server. A file server's life should be boring. So let's split The Server into components that make sense.
Serving Files: Synology NAS
We still need to serve files, so let me pick a dedicated file server. Obvious choices were Synology, QNAP, Xpenology, FreeNAS, and UnRaid. I'm at a point in my life where I sometimes just want infrastructure hardware to work with a minimum amount of tinkering. That rules out the DIY solutions based on Xpenology, FreeNAS, or UnRaid. I'm sure they're fine, and I could build a nice rackmounted DIY file server for less than I would spend on an equivalent plastic Synology box. But someone already designed the Synology, made sure all the parts work together, and made sure the software is maintained and patched and all that. Synology seemed more popular than QNAP, and people rave about the Synology software, so I went with Synology. The DS1019+ to be precise.
[NOTE to self: Should have bought the DS1618+ instead. The DS1618+ has a PCIe slot for a 10GbE NIC, and would have future-proofed the NAS for a while. The DS1019+ has two gigabit NICs, which you could bond to increase aggregate performance across multiple hosts, but not necessarily the burst performance to a single host. Bonding doesn't work the way most people (including myself at the time) think it does: you're typically hashing MAC and/or IP addresses to select which interface on the bond is used, so traffic between a given pair of hosts always lands on the same interface. It's great if you have many hosts using the file server, but it doesn't give you extra burst when you want performance to one specific host.]
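To illustrate why bonding behaves that way, here's a rough sketch of the kind of hash the Linux bonding driver applies in its default layer2 transmit policy (the MAC octets below are made-up examples):

src=0x9e; dst=0x3c                 # hypothetical last octets of the two hosts' MAC addresses
slaves=2                           # two bonded gigabit interfaces
echo $(( (src ^ dst) % slaves ))   # roughly (src MAC XOR dst MAC) mod slave count

Every frame between the same two machines hashes to the same slave, so a single host-to-host transfer never exceeds one link's worth of bandwidth.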
The DS1019+ offers 5 drive bays and has two slots for SSD caching. 5 drive bays is great for running RAID6: you end up with about 3/5 of your raw storage converted into usable redundant storage. RAID6 tolerates two drive failures. "Surely two drives won't fail at once!", you say. The problem is that the time you're most likely to have a second drive fail is during the rebuild that occurs after the first drive failed. Losing a drive during rebuild would really wreck your day. Go with RAID6, not RAID5. The Synology supports something called SHR-2, "Synology Hybrid RAID with 2-disk failure redundancy", and I ended up choosing that over RAID6. Without looking into the specifics of the implementation, I suspect it boils down to approximately the same thing, albeit with SHR permitting easier expansion of the array by slicing disks and doing RAID across those slices.
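For concreteness, here's the capacity arithmetic for a five-bay box, assuming 4TB drives in every bay:

echo "$(( 5 * 4 )) TB raw, $(( (5 - 2) * 4 )) TB usable"   # 20 TB raw, 12 TB usable with two-disk redundancy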
In addition to serving files via traditional protocols such as NFS and SMB/CIFS, the Synology can easily be configured as an iSCSI server, and that’s going to be handy to connect to VMWare ESXi.
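As a rough sketch of what the ESXi side of that hookup looks like from the shell (the adapter name and the Synology's address below are placeholders, not my actual values; the same steps can also be done through the embedded web GUI):

esxcli iscsi software set --enabled=true                                        # enable the software iSCSI initiator
esxcli iscsi adapter discovery sendtarget add -A vmhba64 -a 192.168.1.50:3260   # point it at the Synology (placeholder adapter/IP)
esxcli storage core adapter rescan --adapter vmhba64                            # rescan so the new LUN shows up as a datastore candidate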
The Synology can also do all kinds of other things, including running Synology's Surveillance Station software for recording IP cameras, running backup tasks, and even hosting VMs and docker containers. As stated above, my goal is to not host arbitrary VMs and/or containers on the fileserver.
Serving VMs: Intel NUCs
The Synology itself can serve docker containers and VMs, however I want to keep that to a minimum. 1) You don’t want to bog down the file server with compute tasks, and 2) You don’t want compute task misbehavior to jeopardize integrity of the fileserver. There’s both a performance reason and a reliability reason. As I said above, a file server’s life should be unexciting.
For my compute services I decided to go with Intel NUCs. These are tiny, efficient, brick-sized computers. Performance-per-dollar is not a NUC's strong point, as there are cheaper solutions out there. However, the NUC is well integrated, well built, and very power efficient. I happened to already have two NUCs on hand, a 5th generation NUC5i7 and a 4th generation i5 D54250. I'll refer to these as the i7 NUC and the i5 NUC. Both of them feature gigabit Ethernet and support up to 16GB of RAM. This is perfect for hosting some virtual machines.
There’s lots of software available for hosting VMs these days — everything from OpenStack to VMWare. I’ve traditionally been a VMWare user, so I went with VMWare ESXi (“The vSphere HyperVisor“). You can run ESXi off a USB key, using an iSCSI share to store your VM disks. This gives you an almost entirely diskless solution. You can perhaps go fully diskless as well, using PXE boot, but the USB key seems like an easy and preferred solution.
ESXi is mildly unhappy that the NUCs have a single Network Interface Card (NIC). It doesn't seem to hurt anything, but a redundant infrastructure would use two NICs. Looking around online, there are ways to solve this problem. One solution would be to use a USB3 gigabit NIC. Another is a gigabit adapter that fits in a mini PCI-Express socket, such as https://www.amazon.com/dp/B00B524102. I haven't tried either of these, as the lack of a redundant network connection has not been a problem, but both are reported to be viable solutions.
One NUC or two? or three? or four?
So how many NUCs does one need? That depends entirely on the type of work you do. I'm confident that I could fit all of the VMs I use on a daily basis on a single NUC if I wanted to. However, spreading them across multiple NUCs has performance advantages. A reasonable approach seems to be to put every VM that just plods along all day using few resources on the slower i5 NUC, and reserve the i7 NUC for jobs that are performance sensitive: compile and build tasks where I'm sitting at the keyboard waiting for things to happen.
Redundancy
I put the word Redundant in the title so I should say a few words about it.
- Storage Redundancy. The Synology is running RAID6, so the storage itself is clearly redundant. Two disks can fail and the consumers of the storage services can continue to operate.
- File Server Redundancy. While the disks in the Synology are redundant, I only have one Synology box itself. If the power brick for the Synology fails or if the Synology motherboard fails, then I can’t use the disks anymore. There are ways to run Synology servers in a redundant manner, if you’re just willing to spend twice as much money and own two servers. To save money, I’ve decided that in the unlikely case that the Synology hardware fails, I’ll have to wait two days for replacement hardware to arrive from Amazon.
- VM Server (Compute) Redundancy. As the VMs are stored via iSCSI on the Synology, there's no significant persistent data on the NUCs themselves. I can shut down a VM on one NUC and bring it up on the other. If a NUC fails, I can bring all of its VMs up on the other NUC. There is VMWare software (vCenter) that would help automate this, for a price, but the ESXi software can do it directly via the embedded web GUI with a little more manual intervention (see the sketch after this list).
- Switch/Network redundancy. There's still Ethernet connecting the NUCs, the file server, and my desktop PC, with a single big switch sitting in the middle of it all. Yes, there are ways to implement redundant network infrastructure (at twice the cost), but I've decided that I can always pull a switch, even an older sub-optimal one, out of my box of old networking gear should my switch fail.
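Regarding the compute-redundancy bullet above, here's a minimal sketch of what that manual failover looks like from the surviving NUC's ESXi shell (the datastore and VM paths are hypothetical). The one rule is to make absolutely sure the VM is no longer running on the failed host first:

vim-cmd solo/registervm /vmfs/volumes/synology-lun/devbox/devbox.vmx   # register the VM from the shared iSCSI datastore
vim-cmd vmsvc/getallvms                                                # note the VM ID it was assigned
vim-cmd vmsvc/power.on 12                                              # power it on using that ID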
This is perhaps also the place to make the point that "RAID is not Backup". Being able to survive a hardware failure in a disk or a server is one thing, but being able to survive a software or human failure is another. The RAID doesn't protect you against accidentally running "rm -rf *" in the wrong place, nor does it protect you from the server getting struck by lightning, nor does it protect you against malware. One still needs a whole hierarchy of backups even if one has a RAID. You need both on-site backup (cost effective) and off-site backup (disaster resistance).
The Result – Performance and Power Consumption Analysis
Let’s get to the bottom line first, before digressing into all of the various issues.
Performance
To test performance, I used a simple 2-minute build job running inside a VM. This is a task I had been performing a lot lately, so it's a real-world development workload. It executed in 1 minute 20 seconds on the i7 NUC (with storage over iSCSI) versus 2 minutes 3 seconds on Goliath. That's roughly a 35% reduction in wall-clock time, while going from fancy old enterprise-class Xeon CPUs to a newer commodity low-power i7 CPU. It's perhaps not surprising that a newer CPU outperforms a substantially dated one, but I did have my concerns due to the significant change in class of CPU.
Energy Consumption
My entire server rack (server, switches, NUCs, etc.) is connected to an APC UPS that I monitor automatically with Prometheus and visualize in Grafana, so it was easy to pull existing energy consumption numbers. Previously, with Goliath installed, the rack was consuming 250 watts. With Goliath decommissioned and replaced with the Synology and two NUCs, energy consumption is 120 watts. Breaking that down, approximately 40 watts is from the two NUCs; the rest is for the Synology, Ethernet switches, Comcast cable modem, etc.
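Pulling a spot reading out of that setup is just a query against Prometheus' HTTP API; the metric name below is a guess at what an apcupsd-style exporter might expose, not necessarily what yours will be called:

curl -s 'http://prometheus.local:9090/api/v1/query' --data-urlencode 'query=apcupsd_ups_load_percent'   # hypothetical server name and metric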
The Bottom Line
My workloads execute 35% faster, I’m using half the power, I moved to two-failure disk redundancy, and the Synology UI is really user-friendly to operate.
The Journey: Resolving all the problems
This is where I document all the things that went wrong.
The NUC’s fan is too noisy
One of my work-related tasks is a short 2-minute build job. I noticed that near the end of this job the NUC’s fan would spin up and it would sound like a jet aircraft was occupying the other half of the office. The problem here is that NUCs tend to accumulate dirt between the fan and the cpu heat sink. This dirt impedes the flow of air. If your NUC has been sitting around running for more than a year or two then you probably need to tear it down, remove the fan, and clean out any accumulated cruft. After doing this my NUC no longer spins up for simple short jobs.
Sustained load can still lead to a noisy NUC. There are solutions to that, such as the fanless Akasa cases. The Akasa case looks really nice, albeit quite large (you're basically trading the NUC's usually tiny size for a massive aluminum heatsink of a case). If noise becomes a problem, I'll get the Akasa case. It's been a couple of weeks, and noise hasn't become a problem.
Duplicate MAC addresses from hell
The first NUC I brought up with ESXi worked fine, but the second was a mess. The GUI was sluggish and ESXi couldn’t talk to the Synology.
The issue turned out to be that I had two ESXi hosts sharing the same MAC address on their management VMkernel (vmk) adapters. I stumbled across this while looking at ARP tables. What had happened is that I had followed an online guide for creating a bootable ESXi USB key by using a VM to boot from the ESXi ISO and install onto the USB key, and then repeated this for the second NUC. The problem is that the ESXi installer uses the installation machine's MAC address when it configures the networking stack. When you boot that USB key in another machine, it still has the MAC address from the installation machine and keeps using it, even though the USB key is now in an entirely different host. So I ended up with two USB keys that had the same MAC address. This was confirmed by running a set of commands in an ESXi shell:
esxcfg-nics -l
esxcfg-vswitch -l
esxcfg-vmknic -l
The above will let you inspect various devices and see the MAC addresses. To fix the issue, I did this:
esxcfg-advcfg -s 1 /Net/FollowHardwareMac
This tells the management VMkernel adapter to use the MAC address from the underlying hardware adapter. After that, both NUCs were working fine again. I'm really surprised things worked at all with the conflicting MAC addresses; it must have just been luck that a few packets were getting through.
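If you want to confirm the change took before rebooting, the same tool can read the setting back:

esxcfg-advcfg -g /Net/FollowHardwareMac   # should now report 1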
Only one ESXi host can use the Synology iSCSI LUN at a time
This is actually a feature. Normally you don't want two different hosts using the same disk. iSCSI is effectively a disk, so the default is to prevent this. You don't need the safeguard with ESXi, though, as ESXi has the necessary protection in place to share the iSCSI device. Just go into the Synology interface and make sure "allow multiple sessions" is enabled for the iSCSI target/LUN.
The ESXi GUI crashes every time I try to import an OVF
This was really pretty disappointing — ESXi 6.7U2 apparently ships with a broken embedded web client. You could sometimes, but not always, get the OVF to import by hitting ESC when the error screen popped up. The error message spewed out is similar to this one:
Cause: Possibly unhandled rejection: {}
Version: 1.33.3
Build: 12923304
ESXi: 6.7.0
Browser: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36
Exception stack:
TypeError: Cannot read property 'keyValue' of undefined
    at updateSummaryPortlet (https://192.168.23.206/ui/scripts/main.js:375:415)
    at $scope.wizardOptions.onFinish (https://192.168.23.206/ui/scripts/main.js:375:5968)
    at https://192.168.23.206/ui/scripts/main.js:324:23176
    at m.$digest (https://192.168.23.206/ui/scripts/main.js:324:28780)
    at https://192.168.23.206/ui/scripts/main.js:324:30503
    at e (https://192.168.23.206/ui/scripts/main.js:323:10071)
    at https://192.168.23.206/ui/scripts/main.js:323:11522
“Possibly unhandled rejection”? Who the heck writes error messages like that? There’s no indication what exactly went wrong or what the user should do to fix it. Fortunately I found out online that going back to an older version of the embedded client will fix this. Specifically, execute the following command in the ESXi shell:
esxcli software vib install -v https://download3.vmware.com/software/vmw-tools/esxui/esxui-signed-12086396.vib
The bad version is 1.33.3-12923304, and the good version is 1.33.1-12086396. You can check which version you have with “esxcli software vib get -n esx-ui”.
It's possible that ESXi 6.7U3 already fixes this; I haven't upgraded to find out. It's a shame that an otherwise well-performing product was marred by an embedded web client that had multiple problems. The above crash wasn't the only one.
The NFS user ids are wrong on the Synology
Typically I mount the file server onto various Linux VMs using NFS. Linux makes extensive use of user and group IDs (UID and GID), and these are exposed via the NFS protocol. All of my Linux VMs have my user account set up with UID 500. The Synology comes set up with its user account configured as UID 1026, and there's no obvious way to change this. The Synology is based on Linux, and it does use the familiar /etc/passwd file, but that file is generated from a proprietary database file. Synology includes a tool for operating on that database, but it lacks a command to change user IDs. You could get a UID greater than 1026 by creating subsequent users until you hit the ID you want, but I could find no way to configure the Synology with a UID less than 1026.
Now, this wouldn't be a problem if one were solely using NFS with the Synology: the Synology will happily store and serve NFS files with a UID that it doesn't understand. However, part of the value of the Synology is that you can access files in a variety of ways. For example, I can use an SMB/CIFS share to mount them on a Windows box, I can browse through them with native tools on the Synology, and so on.
In the end I capitulated, going to each one of my Linux VMs and changing my user ID to 1026. For an Ubuntu VM this is fairly simple (a follow-up ownership fix is sketched after the list):
- Boot the Ubuntu VM into maintenance mode by pressing <ESC> during boot and selecting the appropriate menu item
- Exit maintenance mode to a root shell
- Remount root as read-write: "mount -o remount,rw /"
- Change the user id: “usermod -u 1026 myuserid”
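One follow-up worth noting (this is general usermod behavior, not anything Synology-specific): usermod only re-owns files under the home directory, so anything elsewhere on the VM still owned by the old UID has to be fixed by hand, along the lines of:

find / -xdev -user 500 -exec chown -h 1026 {} +   # re-own anything still owned by the old UID (500 in my case)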
Being able to assign an arbitrary UID to an account on the Synology seems like such a trivial feature to add, and the process of having to go around to every other NFS consumer and modify its UIDs is so onerous in comparison, that I'm surprised Synology hasn't just implemented it. It would have great utility for those using NFS.
Backup Solutions
As I mentioned at the top of the article, RAID is not backup. RAID provides high availability and tolerance of hardware failure, but it does not protect against human error, local site disasters, or malicious attacks. For important data, one still needs to backup to a device that is separate from the NAS.
Cloud backup with Amazon Glacier
I've long used Amazon Glacier to back up large artifacts (VM images, tarballs, raw video footage, etc). I do this manually with FastGlacier. Since I now have a Synology, and the Synology has a Glacier package available for it, I figured I'd turn it loose on one of the directories on the NAS and see what happens. This was a bit of a mistake. Glacier is inefficient, both economically and performance-wise, at storing small files, and the directory I picked to back up was full of them. I figured Synology would have written the app with this in mind and aggregated small files together, but no, it just uploads them as separate archives. I should have known better than to assume. Before I really realized what I had done, I had uploaded 30,000 files to Glacier. It could have been worse: unsuspecting novices have uploaded millions of files to Glacier and not realized what happened until they got the bill (Glacier charges $0.05 per 1,000 transfer requests).
After this, I wised up and turned the Glacier app loose only on directories composed of relatively few large files. To facilitate this on directories that have many small files, I wrote a script that tarballs them all together into large archives, and then I submit the large archives to Glacier. This is how Glacier is meant to be used.
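The script is nothing fancy. Here's a minimal sketch of the idea, assuming a layout where each subdirectory becomes one archive (the paths are hypothetical, not my actual script):

#!/bin/sh
SRC=/volume1/photos            # hypothetical source tree full of small files
DST=/volume1/glacier-staging   # hypothetical staging area for large archives
mkdir -p "$DST"
for dir in "$SRC"/*/; do
    name=$(basename "$dir")
    tar -czf "$DST/$name.tar.gz" -C "$SRC" "$name"   # one big tarball per subdirectory
done

The Glacier backup task is then pointed at the staging directory, so each upload is one large archive instead of thousands of tiny requests.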
Glacier is a nice cost-effective service, if you know how to use it properly and you don’t really care how long it takes to retrieve your data.
Local Backup
Cloud backup has the advantage of being offsite and therefore resistant to local disaster scenarios, but it's also helpful to have a local backup solution in place, as local backup offers much greater backup and restore speed than most cloud solutions. While researching this, I found that many Synology users use one of two solutions: either they 1) back up their NAS to another NAS, or 2) back up to an external USB drive. Backing up to another NAS has the advantage that the backup itself may be stored on redundant media if desired, though it also has the disadvantage of having to purchase an entire second NAS. Backing up to an external hard drive has the advantage that it's comparatively cheap, and you can use several external hard drives in order to have a rotating backup solution. The disadvantage is that as your NAS grows in size (the one I built for this article is already 12TB), it becomes less and less convenient to segment that up and fit it onto external hard drives.
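If you go the external-drive route, the simplest form of it is an rsync from the NAS volume to the USB volume; the share and mount-point names below are hypothetical, and Synology also ships packages that wrap this sort of thing up in a GUI:

rsync -a --delete /volume1/documents/ /volumeUSB1/usbshare/documents-backup/   # mirror one share to the external drive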