If, like me, you have been reading a daily stream of articles containing the words pandemic, unprecedented and stockpile, I commit, in an attempt to provide some advice and escapism, that this article will not use any of those words again. Instead I will focus on our fictional character Bob, who has recently read Tim Ferriss's book The 4-Hour Work Week. Bob needs a break; he dreams of living in Hawaii for six months without his boss knowing. So, Bob, this one's for you!
There are four key areas to focus on when considering remote datacentre operations.
1. Monitoring & Management
If you are halfway around the world on the beach (or even if you are physically present), knowing what is going on in your environment is critical. The challenge with remote working is that you need to be able to gain timely and appropriate insights into physical and system issues, without being overwhelmed by alerts:
- Ensure Bob is alerted to changes in state, such as component failures. This should cover issues ranging from failed disks and power supplies to expansion card and port failures.
- The systems are still running, and configuration changes may still be occurring in the environment, so monitoring needs to pick up changes in state: ports dropping, systems going offline, events within the operating system, or systems simply not responding as normal.
- Being able to relate alerts and performance criteria back to applications will ease troubleshooting and help spot bottlenecks before end users are affected. Imagine the praise when Bob resolves month-end application issues for finance, all whilst in his speedos by the pool!
- The one Bob has been putting off for years! Now is the time to tidy up the tools and how they alert and report. He will need to ensure the required systems report the right metrics at the right time, removing the need to trawl through thousands of false positives every day.
- A capability that automatically reports issues to vendor support or managed services. When enabled, it will allow Bob to enjoy his sundowners and rest easy in the knowledge that any physical failure will be raised against his support contract; the vendor will send a part and an engineer where appropriate to rectify the problem. Contracts will need to be checked to confirm this is part of the service being offered, as it can incur additional costs.
- Capturing statistics on how a system is performing will allow Bob to identify events in near real time, and trend analysis can show when a system could become overloaded as workload increases.
- If possible, go beyond applications and build the monitoring and management tooling around service offerings. This shift from monitoring purely the infrastructure layer to a service orientation will give Bob better prioritisation and impact assessment, as he will understand what each issue means to the business.
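The "alert on changes in state, without drowning in false positives" idea above can be sketched in a few lines. This is a minimal, tool-agnostic illustration, not the API of any specific monitoring product; the component names and states are invented for the example.

```python
# Sketch: alert only on state *changes* between polls, not on every poll.
# Component names and states below are illustrative, not from any real tool.

def state_change_alerts(previous, current):
    """Compare two snapshots of component states and return only the
    components whose state has changed since the last poll, as
    (component, old_state, new_state) tuples."""
    alerts = []
    for component, state in current.items():
        if previous.get(component) != state:
            alerts.append((component, previous.get(component), state))
    return alerts

# Example: a failed disk and a recovered port generate alerts;
# the still-healthy power supply stays silent.
before = {"disk-3": "ok", "psu-1": "ok", "port-12": "down"}
after_ = {"disk-3": "failed", "psu-1": "ok", "port-12": "up"}
print(state_change_alerts(before, after_))
```

Real tools layer deduplication, severity and routing on top of this, but the core principle is the same: compare against the last known state so that a long-standing, already-acknowledged condition does not page Bob every five minutes.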
2. Remote Access
Being able to perform remote actions within the datacentre environment will be required to check health status, access platforms and make configuration changes. Being able to connect whilst systems are up, using protocols such as SSH, RDP or your hypervisor management tool through a secure channel, will be the starting point.
In case primary links are down, Bob will need secondary access to systems via a backup connectivity method, so that he can connect to his network, review any issues and, ideally, bring everything back online. It is also important to consider access to other areas of the network and the ability to perform remote-hands configuration: activities such as configuring network ports, out-of-band management, or connecting to management networks that may not be reachable across normal remote connectivity tools. Every activity where Bob would typically walk down to the server room to patch into a switch or connect a serial cable will need a remote equivalent.

For all of these aspects it is important to review the security implications and ensure an attack vector is not being opened into the network. This should involve technologies such as firewalls, multi-factor authentication, trusted devices and VPN to ensure infrastructure access is as secure as possible.
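The primary-then-backup access pattern is simple to express in code. The sketch below is illustrative only: the hostnames are invented, and the reachability probe is passed in as a function so the fallback logic can be shown (and tested) without touching a real network. In practice the probe might attempt a TCP connection to the SSH port or bring up a VPN tunnel.

```python
# Sketch: try access paths in order of preference and use the first one
# that responds. Hostnames are hypothetical; the probe is injectable so
# this logic can run without a live network.

def first_reachable(paths, probe):
    """Return the first access path for which probe(path) succeeds,
    or None if every path is down."""
    for path in paths:
        if probe(path):
            return path
    return None

# Hypothetical access paths: primary VPN first, out-of-band second.
paths = ["vpn.example.com", "oob.example.com"]

# Simulated probe results: pretend the primary VPN link is down.
link_up = {"vpn.example.com": False, "oob.example.com": True}
print(first_reachable(paths, link_up.get))  # falls back to the out-of-band path
```

The ordering of `paths` encodes Bob's preference: normal secure remote access first, the slower or costlier out-of-band route only when the primary fails.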
3. Capacity Planning
Whilst Bob is away, learning to play the ukulele, the business will need to continue.
Where he has time to plan for expansion, he can work with vendors and lead times can be accounted for; however, where the timeline is more pressured, he will need to consider faster, more tactical options.
This could be the time to use Public Cloud or managed service offerings, providing compute or storage on a ‘pay as you use’ model so that he can flex up and down as the demand changes.
For environments that are already Cloud enabled this will be a known resource, and further blogs will expand on this. Where areas of Bob's infrastructure are new to Cloud, leveraging storage offerings through a gateway or proxy could be a way to provide storage to systems in a safer environment, where Cloud consumption and control are masked behind this broker layer.
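The trend analysis mentioned under monitoring feeds directly into capacity planning: a simple straight-line fit over recent usage samples gives a rough "weeks of headroom" figure that tells Bob whether a vendor lead time is viable or a tactical Cloud option is needed. The sketch below uses invented figures and a plain least-squares slope; real planning would account for seasonality and growth spikes.

```python
# Sketch: estimate weeks until a storage array fills, from weekly usage
# samples, using a simple least-squares linear trend. Figures are
# illustrative assumptions, not real capacity data.

def weeks_until_full(usage_tb, capacity_tb):
    """Fit a linear trend to weekly usage samples (in TB) and estimate
    how many weeks remain until capacity is reached. Returns None if
    usage is flat or shrinking (no exhaustion on this trend)."""
    n = len(usage_tb)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(usage_tb) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, usage_tb)) \
        / sum((x - mean_x) ** 2 for x in xs)
    if slope <= 0:
        return None
    return (capacity_tb - usage_tb[-1]) / slope

# Four weekly samples growing ~2 TB/week towards a 100 TB array.
print(weeks_until_full([80, 82, 84, 86], 100))  # ≈ 7 weeks of headroom
```

If the answer comes back shorter than the hardware lead time, that is the signal to reach for a 'pay as you use' option rather than wait on procurement.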
4. Data Protection
Bob is increasingly aware that data is key to the success of his business, but also a primary target for cyber-attack. His workforce is more dispersed, and potentially at higher risk due to the limited control he has over endpoints, so he will need to maintain his endpoint practice and re-assess a few key areas.
As with monitoring and management, being able to access the required reports and alerts in a timely fashion, so that Bob can resolve issues through the appropriate remote connectivity methods, will be critical. His data protection strategy still utilises tape for long-term retention and to keep data on a different form of media. This needs to be reviewed, as changing tapes presents a challenge when he is not physically present. Workarounds include:
- Adjust the tape rotation to over-write tapes in the library where possible. This will not meet long-term retention needs but can act as an air-gapped copy of the data. He may be able to shorten his 12-week cycle to one week and rotate tapes within the library.
- Increase the amount of data kept on disk; this could be local disk, network storage or a backup-optimised target that incorporates deduplication and object storage. The specifics will need to be explored for each environment, but extending disk storage can be an effective way to reduce or remove reliance on tape, even if just for the short term.
- Re-configure backups to commit data to the Cloud as an off-site repository rather than tape. This will require some Cloud storage, but will not only remove the need for manual handling of tapes, it will also open up more restore options and potentially lower recovery time objectives.
- Possibly a last resort (but Bob is very ambitious in his plans): consider implementing a Cloud-native backup solution or managed service for those critical workloads where current tooling is not appropriate. This can act as a quick workaround for areas of high risk and sit alongside the existing data protection environment.
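The "keep more on disk" option above comes down to a sizing exercise: how much disk does it take to hold what the tape cycle used to hold? The sketch below uses a common weekly-full-plus-daily-incremental pattern and an illustrative deduplication ratio; all figures are assumptions to be replaced with numbers from Bob's actual environment.

```python
# Sketch: rough sizing of the disk needed to replace a tape retention
# cycle, assuming one full backup plus six incrementals per week and a
# deduplicating disk target. All figures are illustrative assumptions.

def disk_needed_tb(full_tb, incr_tb, weeks, dedupe_ratio=1.0):
    """Raw backup data held over the retention window, divided by the
    deduplication ratio of a backup-optimised disk target."""
    raw = weeks * (full_tb + 6 * incr_tb)  # 1 full + 6 incrementals per week
    return raw / dedupe_ratio

# 10 TB fulls, 1 TB incrementals, 12-week retention, 4:1 deduplication.
print(disk_needed_tb(10, 1, 12, dedupe_ratio=4))  # 48.0 TB usable needed
```

Even a conservative deduplication ratio can bring the figure down from "new array" territory to something an existing disk target, or a modest Cloud repository, can absorb.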
Reviewing all of these areas, and making decisions that serve Bob well both today and when he is on the beach in Hawaii, will also let him rethink how he operates his infrastructure in the future.
As many of our customers will already have a data protection tool in place, we have pulled together a collection of resources if you wish to enable Cloud storage capabilities. The guidelines below relate to the most recent versions of the most common technologies our customers utilise. There may be some minor differences in the specific configuration steps in your environment, so ensure caution is exercised before making changes to your production environments.
If you require any assistance, Softcat’s team of advisors and consultants will be happy to discuss your particular requirements, including fast provisioning of Azure or AWS tenants in line with best practice, should you need to create a storage target for your backup platform to consume.
We are all facing new and challenging times, both professionally and personally, and now is the time to make sensible decisions that protect our infrastructure and enable us to work as effectively as possible whilst working remotely. Hopefully this article goes some way to helping you do that, and brings Bob's dream of living in Hawaii a step closer to reality.
Look after yourselves, and each other.