Platform, Security, Workplace
Understanding Azure Virtual Network design matters whether you are lifting and shifting an existing application or building a cloud-native architecture from scratch. This guide is written for anyone who wants to know a little more about Azure networking. It covers the most important topics, explaining core concepts for those new to cloud networking while also going into detail for more experienced engineers. Throughout, we will draw clear comparisons to traditional on-premises networking, so you can see exactly where Azure aligns with what you already know and where it differs.
If you are an experienced engineer from a traditional data centre, cable-managing a rack at midnight, arguing about VLAN IDs or explaining to a manager why "just add another port" is not how it works, then cloud networking is going to feel familiar as well as slightly unsettling. Think of it like this: it is like finding out the restaurant you love has switched to a digital menu. The food is the same, but everything else is different. If VLANs are just something you have heard of and are not familiar with, do not worry: we will cover the fundamentals in this guide.
What you will learn in Part 1: IP addressing and CIDR, traditional networking concepts (switches, routers, VLANs, firewalls), Azure Virtual Networks, address spaces, subnets, Network Security Groups, routing with UDRs, and VNet peering with hub-and-spoke architecture.
Before we dive into Azure, let's make sure we are familiar with the basics, because the concepts of traditional networking are exactly the ones Azure has rebuilt in software. At its core, a network is nothing more than a way for devices to talk to each other. In the physical world that means cables, switches, routers and a healthy amount of labelling tape. In Azure it means configuration files and API calls. You never touch a cable: Microsoft deals with all of that in its data centres, and you get a clean software interface to define your network topology.
Every device on a network needs an address, the same way every house needs a postal address so the postman knows where to deliver the parcel. On a network, that is an IP address. Modern networks mostly use IPv4 addresses, which look like this: 10.0.1.5. That is four numbers separated by dots, each between 0 and 255, giving us about 4.3 billion possible addresses globally.
That might sound like plenty, but I can assure you it is not: we ran out of public IPv4 addresses a while ago.
IPv4 addresses come in two flavours: public and private. Three ranges are reserved for private use:

- 10.0.0.0 up to 10.255.255.255
- 172.16.0.0 up to 172.31.255.255
- 192.168.0.0 up to 192.168.255.255
Most home networks use the 192.168.x.x range, while corporate networks tend to favour 10.x.x.x because it holds around 16 million addresses, making it the roomiest block for large networks. That makes it easy to divide into lots of subnets across sites, VLANs and cloud environments without running into issues. Azure VNets use private ranges too, and it is your job to pick one. Later in this article we will come back to why this is an important choice.
Rather than listing every IP address in a network individually, we use CIDR notation to describe a range. It looks like this: 10.0.0.0/16. The number after the slash is the prefix length, it tells you how many bits of the address are fixed (the network part), and how many are free to vary (the host part). IPv4 addresses are 32 bits total, so:
| CIDR | Fixed Bits | Free Bits | Total Addresses | Usable Hosts |
|---|---|---|---|---|
| /8 | 8 | 24 | 16,777,216 | ~16.7M |
| /16 | 16 | 16 | 65,536 | 65,534 |
| /24 | 24 | 8 | 256 | 254 |
| /28 | 28 | 4 | 16 | 14 |
| /30 | 30 | 2 | 4 | 2 |
You subtract 2 from the total because the first address (the network address) and the last address (broadcast) are always reserved. In Azure, however, you have to account for 5 reserved addresses in every subnet you create. We will explain why when we get to subnets.
For example: a /16 gives you a large virtual network with room to grow, a /24 gives you a comfortable subnet with enough room for most scenarios, and a /28 is tiny, suitable for small services like a gateway.
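If you want to check these numbers yourself, Python's standard `ipaddress` module can do the arithmetic. A quick sketch (the 10.0.0.0 base address is just an example):

```python
import ipaddress

# Total and usable addresses for the prefix lengths in the table above.
sizes = {}
for prefix in (8, 16, 24, 28, 30):
    net = ipaddress.ip_network(f"10.0.0.0/{prefix}")
    total = net.num_addresses      # 2 ** (32 - prefix)
    usable = total - 2             # minus the network and broadcast address
    sizes[prefix] = (total, usable)
    print(f"/{prefix}: {total} addresses, {usable} usable on-premises")
```

The `- 2` here is the traditional rule; as noted above, Azure reserves 5 addresses per subnet instead.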
Before we go into Azure-specific concepts, it helps to understand the core concepts of traditional networking. Even if you have never touched a firewall, understanding these concepts will make everything in Azure networking much easier. To be honest, I do not consider myself a network engineer, but I have worked with Juniper, Cisco and FortiGate firewalls and switches in the past, and that experience helps me a lot in Azure networking today.
In a traditional network, a switch connects multiple devices on the same network segment, operating at Layer 2 of the OSI model. Layer 2 deals in MAC addresses, hardware identifiers associated with every network interface. When a switch receives a frame, it checks its MAC address table to find which port the destination belongs to and forwards accordingly. If it does not yet know the destination, it floods the frame out of every port and learns from the reply. Once the table is populated, all of this happens at wire speed.
In Azure: there are no switches you will ever see or configure. Azure’s underlying infrastructure handles all Layer 2 forwarding internally. From your perspective, resources in the same subnet can talk to each other, and that is all you need to know. The switch is Microsoft’s problem.
A router connects different networks together. Where a switch moves traffic within a network, a router moves it between networks. It operates at Layer 3, dealing in IP addresses rather than MAC addresses.
Routers maintain routing tables, lists of “if the destination is in this IP range, send the traffic out of this interface.” Every router on the internet participates in protocols like BGP (Border Gateway Protocol) to share routing information with each other. This is how a packet from your laptop in Amsterdam finds its way to a server in Singapore, hundreds of routing decisions, made in milliseconds.
In Azure, there are no routers you configure directly. Azure creates system routes automatically for every subnet, enabling traffic within the VNet, between peered VNets, and to the internet. When you need to override those routes, you do it with User Defined Routes (UDRs): basically a custom routing table you attach to a subnet. It is the same concept as a static route on a router, just configured in software.
In a traditional datacenter, the network fabric is divided into VLANs: isolated broadcast domains that span multiple physical switches via trunk links. Traffic on VLAN 10, for example, cannot reach VLAN 20 without passing through a router or a Layer 3 switch. This segmentation is done through switch configuration. Managing VLANs at scale is fun, right? You tag ports, configure trunk links to carry multiple VLANs between switches and, to top it off, you keep a spreadsheet of VLAN IDs. Then someone decides that VLAN 200 should now be VLAN 169, and you spend a week fixing it.
In Azure, VLANs do not exist at any level we have access to. Azure replaced the VLAN concept with subnets within Azure Virtual Networks. This should be easier to maintain, but it does not automatically make Azure networking simple. Instead of tagging switch ports, you define a subnet in software, assign it a CIDR range and deploy resources into it. The isolation is enforced automatically: no port configuration, no VLAN ID spreadsheet, no switch changes at 2AM. However, you still need proper oversight and tooling to keep Azure networking under control, because it can become a mess very quickly if you do not do it right.
A firewall decides which traffic is allowed in and out of a network or segment. A traditional firewall sits at the perimeter, between the internet and your internal network, inspecting packets and allowing or denying them based on rules you have set.
Physical firewalls range from expensive enterprise appliances to more affordable SME devices, but they share a common architectural assumption: security lives at the perimeter. If an attacker gets past the firewall, via phishing, stolen VPN credentials or an insider threat, they often find a flat internal network where everything can talk to everything. In Azure, security is not just at the perimeter: NSGs apply rules at the subnet or NIC level, and Azure Firewall adds centralised inspection on top.
But this only works if you configure it the right way. In my opinion, in an Azure hub-and-spoke environment you should combine Azure Firewall with NSGs at the subnet level for maximum protection (you can of course choose a dedicated appliance from, say, Palo Alto instead). Remember that Azure defaults are permissive within a virtual network: if you have front-end, mid-tier and back-end subnets, with a web server in the front end and a database in the back end, the web server can reach the database by default. Only when NSGs deny that traffic, or when all inter-subnet traffic is forced through the firewall, is that path blocked. Layered correctly, anything that gets past the Azure Firewall still hits the NSGs before it reaches the endpoint. That is fundamentally healthier than perimeter-only security.
The following table shows how familiar on-premises concepts map to their Azure equivalents. The honest summary: the concepts are the same. The implementation is completely different.
| Traditional Concept | Azure Equivalent | What Changed |
|---|---|---|
| Physical NIC + cable | Virtual NIC (vNIC) | Software-defined, no cable |
| VLAN | Subnet | Defined in code, no switch config |
| Physical switch | Managed by Azure (invisible) | Not your problem anymore |
| Router / L3 switch | System routes + UDRs | Configured as policy, not on a device |
| VLAN ACL / port ACL | Network Security Group (NSG) | Stateful, applied per subnet or NIC |
| Perimeter firewall | Azure Firewall / NVA | Centralised, managed service |
| Private WAN / MPLS | ExpressRoute | Private dedicated circuit, same concept |
| Site-to-site VPN | VPN Gateway | IPSec/IKE, same protocols |
| IPAM tool / spreadsheet | Address spaces on VNets | Built into the platform |
| Data centre = one location | Azure Region | Global, multiple regions |
| OSPF/BGP between routers | BGP on gateways; internal automatic | You only touch BGP at hybrid boundaries |
If you understand traditional networking deeply, Azure will make sense quickly. If you do not, Azure might seem like magic — and magic you do not understand is just a bug waiting to happen.
The Virtual Network (VNet) is the foundational building block of all networking in Azure: everything networking-related lives within, or connects to, a VNet. A Virtual Network is your own private, isolated network in Azure. Resources inside it can communicate with each other by default (subject to NSG rules), and nothing from the outside can reach in unless you explicitly configure it.
Think of it like this: if your entire on-premises datacenter were destroyed and rebuilt in software, all the switches, routers, cables and patch panels would be replaced by an Azure Virtual Network.
Cost note: a virtual network on its own is free. You pay for the resources you place inside it and for certain features like gateways and firewalls. So there is no excuse for not planning your virtual network properly before deploying anything, because it is hard, if not impossible, to change afterwards!
When you create a virtual network in Azure, the first decision you have to make is: what will the address space be? This CIDR block defines all possible IP addresses within your virtual network, and everything you deploy inside it must have an IP within this range. Choose wisely, because the most painful trap is overlap: if you pick 10.0.0.0/16 and your on-premises network is also 10.0.0.0/16, you cannot connect them. Azure does not know which 10.0.1.5 you mean. You will probably find this out at exactly the wrong moment.
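The overlap check is easy to automate before you deploy anything. A minimal sketch using Python's `ipaddress` module (the ranges are illustrative, not a recommendation):

```python
import ipaddress

# Illustrative ranges: an existing on-premises network and two
# candidate VNet address spaces.
onprem = ipaddress.ip_network("10.0.0.0/16")
bad_vnet = ipaddress.ip_network("10.0.0.0/16")   # clashes with on-prem
good_vnet = ipaddress.ip_network("10.1.0.0/16")  # does not overlap

print(bad_vnet.overlaps(onprem))   # cannot safely connect these networks
print(good_vnet.overlaps(onprem))  # safe to connect via VPN/ExpressRoute later
```

Running this check against every range you might one day connect to (other VNets, other offices, partner networks) is cheap insurance.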
An address space like 10.1.0.0/16, as used in the subnet examples later in this article, is for most organisations a good place to start.
If the VNet is your city, subnets are the districts. Each subnet is a range of IPs carved from the virtual network's address space. Resources deployed into a subnet get an IP address from that range by default, and the first address handed out in each range is .4, for reasons we will see in a moment.
When you design subnets within a VNet, keep a consistent baseline: it makes your network predictable, easier to secure and simpler to hand off to others. Every virtual network should start with a three-tier foundation of web, application and data subnets, plus a few dedicated service subnets, as shown in the example table below.
In Azure there are 5 IP addresses reserved in each subnet, not 2 as in traditional networks:
| Address | Reserved For |
|---|---|
| x.x.x.0 | Network Address |
| x.x.x.1 | Default Gateway |
| x.x.x.2 | Azure DNS |
| x.x.x.3 | Azure DNS (future use) |
| x.x.x.255 | Broadcast |
A /24 subnet gives you 251 usable IP addresses, a /28 gives you 11 out of 16, and a /29 has 8 addresses of which 3 are usable: technically enough room for exactly one thing.
| Subnet Name | CIDR | Purpose | Usable Hosts |
|---|---|---|---|
| snet-web | 10.1.1.0/24 | Frontend / web tier | 251 |
| snet-app | 10.1.2.0/24 | Application / API tier | 251 |
| snet-data | 10.1.3.0/24 | Database tier | 251 |
| snet-mgmt | 10.1.4.0/27 | Bastion, jump servers | 27 |
| snet-integration | 10.1.5.0/27 | App Service / service integration | 27 |
| GatewaySubnet | 10.1.255.0/27 | VPN or ExpressRoute gateway | 27 |
| AzureBastionSubnet | 10.1.254.0/26 | Azure Bastion (if used) | 59 |
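You can sanity-check a subnet plan like this with a few lines of Python. A sketch using a handful of rows from the table above (the `AZURE_RESERVED = 5` rule and the `.4` first-address behaviour are the Azure conventions described earlier):

```python
import ipaddress

vnet = ipaddress.ip_network("10.1.0.0/16")

# A few rows from the subnet plan above; Azure reserves 5 addresses per subnet.
AZURE_RESERVED = 5
plan = {
    "snet-web": "10.1.1.0/24",
    "snet-mgmt": "10.1.4.0/27",
    "AzureBastionSubnet": "10.1.254.0/26",
}

usable_hosts = {}
for name, cidr in plan.items():
    subnet = ipaddress.ip_network(cidr)
    assert subnet.subnet_of(vnet)  # every subnet must be carved from the VNet
    usable_hosts[name] = subnet.num_addresses - AZURE_RESERVED
    # Index 4 is the first assignable address, since .0-.3 are reserved.
    print(f"{name}: {usable_hosts[name]} usable, first assignable IP {subnet[4]}")
```

This kind of script is also a convenient place to assert that no two subnets overlap before you commit the plan to IaC.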
Important: you can name subnets as you like, but a couple of subnet names are mandatory and cannot be changed. In the example above those are GatewaySubnet and AzureBastionSubnet; Microsoft will not let you rename them.
In a traditional network, a VLAN is configured on a network switch and traffic between VLANs must pass through a router; the switch enforces isolation in hardware. Changing a VLAN configuration means touching switch configs across potentially multiple devices, often with immediate production impact.
In Azure, a subnet is a definition in software. Traffic between subnets in the same virtual network is routed automatically, and you control what is allowed between them using NSGs. Changing subnet security rules takes seconds and requires no downtime: you can add and remove NSG rules while services are running.
You do not need a maintenance window for NSG rule changes. That is one of the genuinely nicest things about Azure networking, and the kind of thing that makes on-premises network engineers simultaneously relieved and slightly redundant-feeling.
A Network Security Group (NSG) is a set of security rules that controls inbound and outbound traffic to and from resources in your virtual network. An NSG is stateful: if you allow traffic inbound, the response traffic on that same connection is automatically allowed back out, without a separate outbound rule. This alone is an advantage over classic router ACLs, which are stateless and require mirrored rules in both directions. Every NSG also comes with default rules, including:
| Priority | Rule | Effect |
|---|---|---|
| 65000 | AllowVnetInBound | Allow all VNet traffic inbound |
| 65001 | AllowAzureLoadBalancerInBound | Allow health probes from Azure LB |
| 65500 | DenyAllInBound | Deny everything else |
The DenyAllInBound at 65500 is why new subnets with no custom NSG rules still block internet traffic by default. Good default — less good when you spend 20 minutes wondering why your VM is not responding and then realise you forgot to open port 443.
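To make the priority mechanics concrete, here is a simplified sketch of how priority-ordered, first-match-wins evaluation works. The rule fields are heavily reduced (real NSG rules also have direction, protocol, destination, port ranges, and so on), and the specific rules and addresses are illustrative:

```python
import ipaddress

# Simplified inbound rule set: a custom HTTPS rule plus stripped-down
# versions of the AllowVnetInBound and DenyAllInBound defaults.
rules = [
    {"priority": 100,   "action": "Allow", "port": 443,  "source": "0.0.0.0/0"},
    {"priority": 65000, "action": "Allow", "port": None, "source": "10.1.0.0/16"},
    {"priority": 65500, "action": "Deny",  "port": None, "source": "0.0.0.0/0"},
]

def evaluate(src_ip: str, dst_port: int) -> str:
    """Check rules in priority order (lowest number first); first match wins."""
    ip = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r["priority"]):
        port_ok = rule["port"] is None or rule["port"] == dst_port
        src_ok = ip in ipaddress.ip_network(rule["source"])
        if port_ok and src_ok:
            return rule["action"]
    return "Deny"  # unreachable while a catch-all deny rule exists

print(evaluate("203.0.113.7", 443))  # allowed by the custom rule
print(evaluate("203.0.113.7", 22))   # falls through to the DenyAll default
print(evaluate("10.1.2.5", 22))      # allowed by the VNet default rule
```

Internet traffic on port 22 never finds an allow rule and hits the 65500 deny, which is exactly the "why is my VM not responding" scenario above.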
This is a nice feature: instead of manually tracking and updating lists of Microsoft IP ranges, which Microsoft changes often, NSG rules support Service Tags. A Service Tag is maintained by Microsoft and contains all the IP prefixes for a specific service. Common ones include `Internet`, `VirtualNetwork`, `AzureLoadBalancer`, `Storage` and `Sql`.
You can write a rule that says "allow HTTPS from the Internet to this subnet" without hardcoding a single IP address. When Microsoft changes its infrastructure IPs, your rule stays valid. This sounds like a small thing until you remember the alternative: manually updating firewall rules at 11pm.
A traditional perimeter firewall sits between your internal network and the outside world. Often there is just one firewall and all traffic goes through it: powerful, but a choke point, and expensive to scale.
NSGs are distributed: every subnet and NIC can have its own. That means you are not just securing the perimeter but every internal boundary too. A compromised web server cannot directly attack your database server if the database subnet's NSG denies SQL traffic from anything except the application subnet. This is a fundamentally better security posture than "once it is inside our internal network, we trust it".
Limitation to know: NSGs are Layer 4 only, they understand IP addresses and ports, but not application-layer content. For deep packet inspection, FQDN-based rules, TLS inspection, or IDPS, you need Azure Firewall (covered in Part 2).
Once you deploy resources into Azure subnets, they can talk to each other. The question is: how does Azure know where to send the traffic? The answer is system routes, which Azure creates by default for every subnet:
| Address Prefix | Next Hop |
|---|---|
| VNet address space | VNet local (stays within VNet) |
| 0.0.0.0/0 | Internet |
| On-prem ranges (if gateway exists) | VNet gateway |
Most of the time, this is all you need: traffic within the VNet routes locally, and traffic to the internet goes to the internet. Simple, just like that.
The trouble starts when you do not want traffic going directly to the internet, because you want to inspect and control it at a centralised firewall first, or because you want traffic between two spoke VNets to route through a central hub. This is where User Defined Routes (UDRs) come in.
A UDR is a custom route you add to a Route Table, which is associated with one or more subnets; it overrides the system routes for the prefixes it covers.
Example: force all internet-bound traffic from your app subnet through Azure Firewall:
```
Route Table:     rt-spoke-app
Associated with: snet-app
Route name:      force-internet-to-firewall
Address prefix:  0.0.0.0/0
Next hop type:   Virtual Appliance
Next hop IP:     10.0.0.4 (Azure Firewall's private IP)
```
Now nothing in snet-app can reach the internet directly. All traffic hits the firewall first.
Available next hop types include Virtual Appliance, Virtual Network Gateway, Virtual Network, Internet, and None (which simply drops the traffic).
A UDR is identical in concept to a static route on a physical router. The difference is that you set it in the portal or an IaC template rather than typing `ip route` into a Cisco CLI: same intent, entirely different method.
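Under the hood, route selection works the way it does on any router: the most specific (longest) matching prefix wins. A sketch of that logic, using the VNet range and the firewall UDR from the example above (route names and next hops are illustrative):

```python
import ipaddress

# Effective routes for the subnet: the system route for the VNet range
# plus the 0.0.0.0/0 UDR pointing at the firewall.
routes = [
    ("10.1.0.0/16", "VNet local"),
    ("0.0.0.0/0",   "Virtual Appliance 10.0.0.4"),
]

def next_hop(destination: str) -> str:
    """Pick the matching route with the longest prefix (most specific wins)."""
    ip = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if ip in ipaddress.ip_network(prefix)
    ]
    net, hop = max(candidates, key=lambda c: c[0].prefixlen)
    return hop

print(next_hop("10.1.2.5"))  # stays inside the VNet
print(next_hop("8.8.8.8"))   # internet-bound, so it hits the firewall first
```

This is why adding a 0.0.0.0/0 UDR does not break traffic inside the VNet: the /16 system route is more specific and keeps winning for internal destinations.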
A single VNet is fine for one or a few workloads, but most organisations end up with multiple VNets across multiple subscriptions: a hub VNet for connectivity, spoke VNets for workloads, infrastructure and management, and so on. VNet peering creates a direct private connection between two virtual networks over Microsoft's backbone network. Traffic never touches the public internet, so latency is extremely low, and bandwidth is limited only by the VM sizes you choose. You can peer VNets in the same region, across regions (global peering) and across subscriptions.
Peering is configured on both virtual networks. You add a peering on VNet A pointing to VNet B, and another on VNet B pointing to VNet A. It is bidirectional, but you have to set it up from both sides. Forget one side and nothing works, a fun troubleshooting experience.
Critical point: peering is not transitive. If VNet A peers with VNet B, and VNet B peers with VNet C, then A and C cannot talk to each other via B. You either peer A directly with C, or you route traffic through a hub, which is the whole point of the hub-and-spoke pattern.
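The non-transitivity rule is easy to internalise once you see that reachability via peering is only ever the set of direct links. A toy sketch (the VNet names are illustrative):

```python
# Each peering connects exactly two VNets; there is no "via" in peering.
peerings = {("hub", "spoke-a"), ("hub", "spoke-b")}

def can_reach(a: str, b: str) -> bool:
    """Peered reachability is direct links only; order does not matter."""
    return (a, b) in peerings or (b, a) in peerings

print(can_reach("spoke-a", "hub"))      # directly peered
print(can_reach("spoke-a", "spoke-b"))  # both peer with the hub, but
                                        # not with each other
```

To get spoke-to-spoke traffic flowing through the hub, you combine peering with UDRs that point spoke traffic at a firewall or router in the hub, which is the hub-and-spoke pattern described next.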
The hub-and-spoke topology is the go-to architecture for most enterprise Azure deployments: a central hub VNet holds the shared services (such as Azure Firewall, a VPN or ExpressRoute gateway, and DNS), and each workload lands in its own spoke VNet peered to the hub.
This gives you a single inspection point for all internal traffic between spokes and all traffic going to the internet, without the cost and complexity of peering every spoke to every other spoke, which would be really hard to manage!
Traditional equivalent: VNet peering is similar to a direct dark fibre connection or a dedicated private interconnect between two data centres — private, fast, and not going via the internet.
If you have made it this far, you have covered a significant amount of ground. Here is the essential version:
Security & Private Connectivity in Part 2
Load Balancing in Part 3
Hybrid Connectivity & Operations in Part 4
The mindset shift is this: traditional networks are built on hardware you manage, while Azure networking is built on configuration you define. The concepts are the same: routing, segmentation, access control and connectivity. The implementation is entirely different.
The best Azure network engineers are not the ones who have forgotten how physical networking works. They are the ones who understand it well enough to adapt to a different way of working: to see what Azure has replaced, what it has improved, and where the underlying principles still apply.
Now go build something.
This article is part of the Azure Networking series on larsschouwenaars.com. Read more networking articles here.