Prone to wander. Chapter 1

Andrew Cochrane

Chief Technologist: Datacentre

After a great response to the article published a few weeks ago, which followed Bob and his exploits in optimising remote datacentre operations during a 6-month sabbatical in Hawaii, I wanted to continue with this theme. Whilst in Hawaii, Bob had good days and bad, but he learnt a great deal and is ready to optimise his working environment for any future escapades. Let’s review his notes: Vagabonding for a DC Admin!

Disaster recovery

Bob’s employer, CompuGlobal Megacorp, have plans in place for Disaster Recovery (DR), but it has never been a top priority. The DR strategy relies on tapes stored in a remote datacentre that hosts a hyperconverged infrastructure with about 10% of the compute capacity of the production site, limited external connectivity and workplace recovery for 50 key workers. During the annual parallel failover test, Bob realised that these disaster recovery plans relied on a physical presence near the datacentre for himself and the end users. Sipping on his 3rd Mai Tai before midday, he was struck by a need to do something different, or he’d be enjoying an early hangover.

Testing a DR strategy should not be something that is dreaded for its complexity and therefore avoided; it should be something that can be run at any time because of confidence in the process and technology. Failing over workloads and failing them back with minimal disruption to end users is key. Bob realised he needed to remove or reduce his reliance on tape; this way he can automate the movement of data and applications from anywhere in the world.

The hosting of the DR site also needed to be reviewed. Quite often it doesn’t make sense to have a DR site the same size as production; rather, the ability to scale as and when needed is critical to a robust strategy. Utilising managed services, public cloud or a consumption model for hardware procurement can all help to deliver against this strategy.

Finally, Bob must review the connectivity as part of the new hosting strategy to ensure end users and administrators can access these services with little or no reconfiguration in how they connect. This is no easy feat and can require a number of technologies, such as global load balancing, high availability application design and software defined networking, to name a few. The benefit of this approach is that it removes the dependence on a physical location and decouples the layers within an application and its infrastructure. This enhances not only the approach to disaster recovery, but also the agility of the business.
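To make the idea behind global load balancing a little more concrete, here is a minimal sketch of priority-based failover: traffic goes to the preferred site unless its health check fails. The site names, the priority field and the health flags are all assumptions made for illustration; a real global load balancer would run its own health probes and would usually steer traffic through DNS or anycast rather than a Python function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    name: str       # hypothetical site label
    priority: int   # lower number = preferred site
    healthy: bool   # result of the latest (assumed) health check


def pick_site(sites: list[Site]) -> Optional[Site]:
    """Return the healthy site with the lowest priority number, or None."""
    healthy = [s for s in sites if s.healthy]
    return min(healthy, key=lambda s: s.priority) if healthy else None


sites = [
    Site("production", priority=1, healthy=False),  # simulated outage
    Site("dr", priority=2, healthy=True),
]
chosen = pick_site(sites)
print(chosen.name if chosen else "no site available")
```

Because users connect to whatever the selection returns, neither they nor Bob need to reconfigure anything when production is unavailable.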

Workload portability

Bob realised that one of his main challenges was the ability to be flexible with his workloads. This could be due to resource constraints, data residency requirements or user location. These are not usually challenging when everyone is in one location, but as he travelled, this reliance on one location quickly became a burden. This was compounded when Larry from Legal requested he move 500TB of data to an EU datacentre for compliance reasons. He needed to be able to better adapt and move workloads as and when required.
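A quick back-of-the-envelope calculation shows why Larry’s request is a burden. The sketch below estimates how long 500TB takes to move over a network link; the link speeds and the 80% efficiency factor are assumptions for illustration, not figures from Bob’s environment.

```python
def transfer_days(size_tb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to move size_tb terabytes over a link_gbps link."""
    bits = size_tb * 1e12 * 8                   # decimal terabytes to bits
    usable_bps = link_gbps * 1e9 * efficiency   # assumed throughput after overhead
    return bits / usable_bps / 86_400           # seconds to days


for gbps in (1, 10):
    print(f"{gbps} Gbit/s: {transfer_days(500, gbps):.1f} days")
```

Under these assumptions the move takes roughly 58 days on a 1 Gbit/s link and just under 6 days at 10 Gbit/s, which is why planning for portability matters before the request arrives.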

There are three main layers at which to consider portability:

Data Layer

When considering data portability, we are primarily looking at unstructured data, as structured data would be included as part of the application. Unstructured data is exactly that: large amounts of fragmented and diverse file types with little or no context or structure for their storage. To tackle this challenge, Bob needs a data management solution that can monitor and manage this data in its primary location as well as the backup copies he keeps. Ideally, he wants to be able to do this across on-premises and cloud, and across an array of file, object and block storage solutions.

Application Layer

Application portability can quite often be the most complex to enable, but it comes with the greatest business benefit. Being able to transition an application’s current state from one location to another, whilst maintaining a consistent experience for end users, is extremely powerful in our digital world. The approach and tooling will change between applications, their architectures and the layers within the application stack.

Hypervisor / Container Layer

Enabling workload portability at the hypervisor or container layer can be a simpler and more cost-effective architectural approach than tackling the complexities of the application layer. Hypervisors will need strong orchestration to ensure the right virtual machines are migrated and started in the correct order, avoiding any corruption within the application. As part of the migration, it might also be a requirement to transform the virtual machine format, such as from VMware to Hyper-V, AWS EC2 or Azure Virtual Machines. Containers simplify the portability of applications as they are more abstracted between the application layers when compared to a traditional virtual machine. Again, orchestration is key here, but the transformation step can usually be avoided as most on-premises and cloud platforms provide support for the main kernels and engines.
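The "correct order" requirement can be expressed as a dependency graph and resolved with a topological sort. The following is a minimal sketch using Python’s standard library; the virtual machine names and their dependencies are hypothetical, and a real orchestration tool would also wait for each machine to report healthy before starting the next.

```python
from graphlib import TopologicalSorter

# Hypothetical VM names; each key lists the VMs that must be running first.
depends_on = {
    "database": set(),
    "app-server": {"database"},
    "web-frontend": {"app-server"},
    "monitoring": set(),
}

# static_order() yields every VM only after all of its dependencies.
boot_order = list(TopologicalSorter(depends_on).static_order())
print(boot_order)
```

The database always appears before the application server, which in turn appears before the web front end. Machines with no dependency between them, such as monitoring, may land anywhere in the list.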

Automation & orchestration

So far, all of Bob’s lessons learnt have involved some automation or orchestration. He needs to look at the tasks that he performs and see if each one can be automated within the tool he already uses. Where he has a number of automated tasks, he can look to orchestrate these so they run in the correct order. This can be done piecemeal within many tools today, but for a fully automated and orchestrated environment it is likely he will need to look to 3rd parties. When selecting such a tool, he should weigh it against the following criteria.

Ease of Setup

How easy is it to install, configure, run a discovery and integrate with hardware and applications, up to the point where you can start automating tasks?

Ease of Management

Once everything is configured, how easy are day 2 operations? These include reporting, updating, changing existing configuration and adding new configuration to the platform.

Bare Metal Support

Bob still has some legacy hardware that he is unable to virtualise. He needs to check what support the tool has for physical servers and mainframes.

Network, Storage, Cloud & Hypervisor Support

This should be reviewed across on-premises and cloud to ensure the tool covers the systems in his environment and offers a sufficient depth of features for each of them.

Operating System

Bob’s environment has a wide gamut of operating systems, so he needs to ensure the tooling can integrate with Microsoft, Linux, Unix and Mac operating systems.

Scalability

Scale needs to be reviewed in terms of locations (on-premises, cloud and edge datacentres) as well as the number of devices that can be supported, so both breadth and depth of scale are required.

Integration

Commonly a key selection criterion is the need to integrate with 3rd parties and the methods by which this is done. This will include line of business applications, ITSM, ITAM, cyber security tools, log aggregators, collaboration tools, device management and many more. Bob is aware of reference architectures and marketplaces where vendors have defined levels and methods of integration; these will enable him to be confident of the integrations supported.

Compliance Functionality

Quite often these criteria are not obvious in the selection process, but it is important to consider how compliance is governed within these tools. Bob will need an interface for his administration purposes, but how will his actions and the actions of the platform be monitored? The ability to provide a compliance view to colleagues like Larry from Legal is important to Bob and his selection criteria.
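One way Bob could compare candidate tools against the criteria above is a simple weighted scoring matrix. This is only a sketch: the weights, the scores and the two tools are invented for illustration, and Bob would set his own based on what matters most in his environment.

```python
# Weights (importance, 1-5) are invented for illustration only.
weights = {
    "ease_of_setup": 2,
    "ease_of_management": 3,
    "bare_metal_support": 4,
    "platform_support": 4,      # network, storage, cloud and hypervisor
    "operating_systems": 3,
    "scalability": 3,
    "integration": 5,
    "compliance": 4,
}


def weighted_score(scores: dict[str, int]) -> float:
    """Return the weighted average of the 1-5 scores across all criteria."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


# Hypothetical tools: tool_a is average everywhere, tool_b trades
# ease of setup for stronger integration.
tool_a = {c: 3 for c in weights}
tool_b = {**tool_a, "integration": 5, "ease_of_setup": 1}

for name, scores in (("tool_a", tool_a), ("tool_b", tool_b)):
    print(name, round(weighted_score(scores), 2))
```

With these made-up numbers, the second tool comes out ahead because integration carries more weight than ease of setup. Changing the weights changes the outcome, which is the point of making them explicit.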

…And breathe!

If you have read this far into the perspective, thanks for sticking with me! Even when focussing only on the top themes in this space, there is still a lot to discuss. To avoid overload and allow Bob to explore his learnings to the fullest extent, we are going to deliver this blog in two parts.

I hope you will join me in the second part where we explore considerations for Virtual Desktop Infrastructure (VDI) and financing.