Why Hybrid Infrastructure Still Matters

For universities, regulated enterprises, and public sector organisations, a cloud-only architecture is often the wrong answer. Hybrid infrastructure is not a transitional state; for many environments it is the correct long-term model.

Every few years the infrastructure conversation resets around a single claim: the future is fully cloud. Everything on-prem will migrate. The data centre is a transitional technology. Just finish the move.

That claim has been accurate for some organisations. For many others (universities, research institutions, regulated enterprises, manufacturing businesses, public sector bodies) it has been wrong for a decade and continues to be wrong. Not because those organisations are resistant to cloud or lack the engineering capability to migrate, but because their operational requirements do not map cleanly onto a cloud-only model.

Hybrid infrastructure is not a failure to finish migrating. In many environments, it is the correct architectural answer: not a transitional state waiting to be resolved, but the permanent production architecture.


Where the cloud-only narrative breaks down

The cloud-only model works best when workloads are relatively stateless, data gravity is low, regulatory constraints are manageable, and network latency between the application and the cloud region is acceptable. Many production environments satisfy none of those conditions simultaneously.

Universities and research institutions run high-memory compute workloads against large local datasets. Moving that data to cloud storage for every run introduces latency and egress costs that make the economics substantially worse, not better. The right answer for those workloads is local compute with cloud as burst capacity, not cloud as the primary target.

Manufacturing and industrial environments often have equipment with fixed network connectivity requirements: OT networks that cannot route through the public internet, devices that communicate on isolated segments, control systems that need sub-millisecond response times. Cloud-hosted control logic is not an option for those systems, regardless of the cloud provider’s regional footprint.

Regulated industries deal with data residency requirements that are still evolving. “The cloud provider has a region in this country” is not always a sufficient answer when the regulatory requirement is about where specific data may be processed, not just stored.

None of these are edge cases. They are significant categories of real operating environments, and the organisations that run them do not have the option of treating hybrid as a temporary state.


What hybrid actually means in practice

Hybrid infrastructure is not “some stuff in cloud, some stuff on-prem, not really connected.” That is multi-environment, which is a different problem.

Hybrid means the two environments work together as a single operational system. On-prem compute bursts into cloud capacity when local resources are exhausted. Cloud-based DR targets maintain warm standby from on-prem primaries. WAN edge devices bridge on-prem and cloud networks with routing that both sides understand. GitOps delivery chains target the right environment based on what the workload needs, not based on where it happens to be running today.

on-prem (Proxmox) -> IPsec/BGP -> edge (Hetzner VyOS) -> HA VPN/BGP -> cloud (GCP)
  - workloads                              - routing hub                  - burst + DR
  - PostgreSQL primary                                                     - Cloud SQL replica
  - RKE2 cluster                                                           - GKE burst target
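The burst path in the diagram can be sketched as a simple placement decision: prefer local capacity, spill to the cloud target only when it is exhausted. The sketch below is illustrative only; the cluster names, capacity figures, and `place` function are invented placeholders, not real HybridOps tooling.

```python
# Sketch of the burst-scheduling decision: prefer on-prem, spill to cloud.
# All names and numbers are illustrative placeholders.

from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    cpu_capacity: float   # total allocatable cores
    cpu_requested: float  # cores already requested by workloads

    def has_headroom(self, request: float) -> bool:
        return self.cpu_requested + request <= self.cpu_capacity


def place(workload_cpu: float, on_prem: Cluster, burst: Cluster) -> str:
    """Prefer on-prem capacity; spill to the cloud burst target when full."""
    if on_prem.has_headroom(workload_cpu):
        return on_prem.name
    if burst.has_headroom(workload_cpu):
        return burst.name
    raise RuntimeError("no capacity in either environment")


rke2 = Cluster("rke2-onprem", cpu_capacity=64, cpu_requested=60)
gke = Cluster("gke-burst", cpu_capacity=256, cpu_requested=10)

print(place(2.0, rke2, gke))  # fits locally: "rke2-onprem"
print(place(8.0, rke2, gke))  # exceeds local headroom: "gke-burst"
```

Real implementations make this decision with scheduler-level signals (pending pods, node pressure) rather than a single CPU counter, but the shape of the policy is the same.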

The operational complexity here is real. Keeping that topology coherent (consistent routing, synchronised IPAM state, working replication across the boundary, GitOps delivery targeting the right environment) requires deliberate architecture. It does not happen by accident and it does not maintain itself.
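One concrete piece of that coherence is IPAM state: the same prefix allocations have to be visible, and agree, on both sides of the boundary. A minimal drift check might look like the following, where the record format and sample data are assumptions standing in for whatever IPAM backend is actually in use:

```python
# Minimal IPAM drift check between two environments.
# The prefix -> owner record format and sample data are illustrative.

def ipam_drift(on_prem: dict[str, str], cloud: dict[str, str]) -> dict[str, list[str]]:
    """Compare prefix -> owner mappings recorded in each environment."""
    return {
        "missing_in_cloud": sorted(set(on_prem) - set(cloud)),
        "missing_on_prem": sorted(set(cloud) - set(on_prem)),
        "owner_mismatch": sorted(
            p for p in set(on_prem) & set(cloud) if on_prem[p] != cloud[p]
        ),
    }


on_prem_records = {"10.10.0.0/24": "rke2", "10.10.1.0/24": "postgres"}
cloud_records = {"10.10.0.0/24": "rke2", "10.10.2.0/24": "gke-burst"}

print(ipam_drift(on_prem_records, cloud_records))
```

The useful property is not the comparison itself but that it runs continuously: drift between the two environments is detected as an alert, not discovered during an outage.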


The operational challenge

The reason hybrid infrastructure has a reputation for being hard is not that the individual pieces are particularly complex. VPN tunnels, BGP peering, PostgreSQL replication, and GitOps delivery are all well-understood technologies with mature tooling. The challenge is operating all of them together coherently across an environment that spans physical hardware, edge infrastructure, and cloud.

In a cloud-only environment, the operational surface is relatively contained. The cloud provider handles a significant amount of the undifferentiated infrastructure work: networking primitives, storage, managed services. The substrate is abstracted.

In a hybrid environment, the operator is responsible for the substrate as well: the WAN edge devices, the on-prem SDN fabric, the physical compute, the storage layer, the cross-environment connectivity. Those are your problem, and they interact with the cloud side in ways that require careful ongoing management.

This is not an argument against hybrid. It is an argument for investing in the operational model with appropriate seriousness rather than treating the hybrid boundary as a configuration problem to be solved once and forgotten.


Why the investment is worth making

The organisations that have built reliable hybrid infrastructure have something that pure-cloud environments often don’t: genuine operational resilience with a known and predictable cost structure.

They can run primary workloads on hardware they own, at costs they control, without paying cloud compute rates for always-on capacity. They can burst into cloud for peak loads without committing to the capacity year-round. They have DR capability that does not depend on keeping warm cloud instances running continuously at full cost.
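The cost structure argument can be made concrete with toy numbers. Every figure below is invented for illustration, not a quote from any provider: the point is the shape of the comparison, always-on cloud rates for baseline capacity versus amortised owned hardware plus cloud spend only for the peaks.

```python
# Toy cost comparison: always-on cloud vs owned hardware + cloud burst.
# Every number here is an invented placeholder for illustration.

HOURS_PER_YEAR = 8760


def cloud_only(baseline_rate: float, burst_rate: float, burst_hours: int) -> float:
    """Pay cloud rates for baseline capacity all year, plus burst on top."""
    return baseline_rate * HOURS_PER_YEAR + burst_rate * burst_hours


def hybrid(hw_annual: float, burst_rate: float, burst_hours: int) -> float:
    """Amortised owned hardware for the baseline, cloud only for the peaks."""
    return hw_annual + burst_rate * burst_hours


cloud = cloud_only(baseline_rate=4.0, burst_rate=6.0, burst_hours=500)
hyb = hybrid(hw_annual=18_000, burst_rate=6.0, burst_hours=500)
print(f"cloud-only: {cloud:,.0f}  hybrid: {hyb:,.0f}")
```

The real calculation has many more terms (staff, power, hardware refresh, committed-use discounts), and the numbers can go either way; the model only shows why always-on cloud capacity for a steady baseline is the term worth scrutinising.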

Many environments that commit to hybrid architecture end up with platform engineers who understand the full stack in a way that pure-cloud engineers rarely need to. Operating physical hardware and cross-environment network architecture builds a depth of infrastructure understanding that cloud abstractions can obscure. Engineers who have configured BGP peering, managed Patroni failover, and operated a GitOps delivery chain across environments have skills and intuitions that survive any provider migration or abstraction change.


The architecture question

The interesting question for hybrid environments is not “when will we finish moving to cloud?” It is “what is the right architecture for the workloads and operational requirements we actually have?”

For many organisations, the honest answer involves local compute for latency-sensitive or data-heavy workloads, cloud for burst and DR, and a WAN architecture that keeps them coherent as a single operational system. That is a legitimate production architecture, not a transitional state, not a compromise, not a failure of cloud ambition.

HybridOps is built around this model: on-prem compute on Proxmox, cloud landing zones on GCP and Azure, WAN edge on Hetzner, with the operational tooling designed to treat the entire topology as a single system. The same modules, the same run records, the same verification paths, regardless of which side of the boundary the infrastructure lives on. That is a solvable engineering problem. It requires deliberate design, but it is not uniquely hard.
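The "same verification paths" idea can be illustrated with a small sketch. The check names, targets, and stand-in predicates below are hypothetical, not actual HybridOps tooling; the property being shown is that the verification logic is written once and parameterised by environment, rather than duplicated per side of the boundary.

```python
# Sketch: one verification routine, parameterised by environment.
# Check names, targets, and the stand-in predicates are hypothetical.

from typing import Callable

Check = Callable[[str], bool]


def verify(env: str, targets: dict[str, str], checks: dict[str, Check]) -> dict[str, bool]:
    """Run the same named checks against whichever environment's target."""
    target = targets[env]
    return {name: check(target) for name, check in checks.items()}


targets = {"on-prem": "10.10.1.5", "cloud": "10.20.1.5"}
checks = {
    "reachable": lambda host: host.startswith("10."),  # stand-in for a ping
    "replicating": lambda host: True,                  # stand-in for a lag check
}

print(verify("on-prem", targets, checks))
print(verify("cloud", targets, checks))
```

In practice the checks would be real probes (ICMP, BGP session state, replication lag), but the structure is the point: one codebase, two environments, identical verification output.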

Hybrid infrastructure is not the consolation prize for organisations that haven’t yet moved to cloud. For many of them, it is the architecture they will be operating for the foreseeable future, and the organisations that treat it as such, invest in it as such, and build the operational model to match will be the ones that operate it reliably.