There was a time when data center networks were just the plumbing – as long as packets flowed, we were happy. Most of the attention went to expanding CPU capacity, adding more storage, or later, managing VMs and containers. Network traffic was largely about users coming into the data center or data going out (classic North-South flows). But with the rise of modern AI, those days are gone. In today’s AI boom, the network has taken on a starring role, especially East-West traffic: the data now flowing between servers within the data center.
AI’s Network Appetite: East-West Takes Center Stage
The explosive growth of AI (think large language models, recommendation systems, massive analytics) has supercharged the volume of East-West traffic inside data centers. In fact, lateral East-West traffic now far exceeds the traditional North-South traffic that goes in and out of the data center. Some industry analyses find that 70–90% of data center traffic now stays inside the data center (East-West) rather than flowing to or from external clients. This is a dramatic shift! It means the biggest networking challenge today isn’t just handling users downloading videos or requesting web pages; it’s handling servers blasting data to other servers.
Why such a shift? One big reason is modern AI workloads. Training a big AI model (like a GPT-style language model) isn’t done on one machine – it’s spread across dozens or hundreds of servers, all working in parallel. Those servers are constantly exchanging huge volumes of data (model parameters, gradients, data shards) every second. This server-to-server chatter is pure East-West traffic. If the internal network can’t keep up, the entire AI job crawls. Traditional network designs buckle under this load: AI tasks involve tightly coupled, high-bandwidth data exchanges that older Ethernet networks weren’t designed to handle. In short, AI has turned the data center network into a critical performance bottleneck – or, if done right, a competitive advantage. The network is no longer just a cost center; it’s a make-or-break factor for AI-driven organizations.
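To make that chatter concrete, here’s a minimal sketch of the pattern that generates most of this East-West traffic: a gradient all-reduce across every rank of a training job. It assumes a multi-node job launched with something like torchrun (which sets RANK, WORLD_SIZE, and MASTER_ADDR) and NVIDIA GPUs using the NCCL backend; the tensor size is just an illustrative stand-in for a model’s gradients.

```python
# Minimal sketch: the collective that makes distributed training East-West heavy.
# Every training step ends with an all-reduce of the gradients across all ranks,
# so each server sends and receives roughly the full gradient payload per step.
import torch
import torch.distributed as dist

def gradient_sync_demo():
    dist.init_process_group(backend="nccl")   # NCCL rides RDMA (InfiniBand/RoCE) when available
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)

    # Stand-in for a model's gradients: 500M fp16 values ~= 1 GB per step.
    grads = torch.randn(500_000_000, dtype=torch.float16, device="cuda")

    # Ring all-reduce: each rank moves ~2 * (N-1)/N of the payload over the fabric.
    dist.all_reduce(grads, op=dist.ReduceOp.SUM)
    torch.cuda.synchronize()
    dist.destroy_process_group()

if __name__ == "__main__":
    gradient_sync_demo()
```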
To put it in perspective, consider that NVIDIA’s DGX B300 system (a workhorse for AI) has an aggregate NVLink bandwidth of 14.4 TB/s across its GPUs, using special in-system interconnects. Feeding data at anywhere near that rate across a network requires an ultra-fast fabric. In many hyperscale AI deployments (like large public cloud GPU clusters), operators resort to dedicated high-performance networks like InfiniBand to get the low latency and high throughput needed. In other words, the internal network has to operate at supercomputer levels. This is something quite new for data centers outside of specialized HPC labs.
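A quick back-of-envelope comparison shows why. The 14.4 TB/s figure is the one quoted above; the NIC count and speed (one 400 Gb/s adapter per GPU, eight per node) are assumptions chosen for illustration, not a specific product spec.

```python
# Back-of-envelope: bandwidth inside one 8-GPU server vs. bandwidth leaving it.
NVLINK_AGG_TBPS = 14.4 * 8        # 14.4 TB/s of NVLink ~= 115.2 Tb/s inside the box
NICS_PER_NODE   = 8               # assumption: one NIC per GPU
NIC_SPEED_GBPS  = 400             # assumption: 400 Gb/s per NIC

inter_node_tbps = NICS_PER_NODE * NIC_SPEED_GBPS / 1000   # 3.2 Tb/s out of the box

print(f"Intra-node (NVLink): {NVLINK_AGG_TBPS:.1f} Tb/s")
print(f"Inter-node (NICs):   {inter_node_tbps:.1f} Tb/s")
print(f"Gap: ~{NVLINK_AGG_TBPS / inter_node_tbps:.0f}x, which is why the fabric must be engineered so carefully")
```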
North-South vs. East-West: Understanding the Difference
So what exactly do we mean by “North-South” and “East-West” in a data center? These terms borrow from the compass to describe the direction of data flow:

Source: NVIDIA
- North-South traffic is the in-and-out flow. It’s data going northbound from inside the data center out to the wider world (to users, clients, or other data centers) or coming southbound from outside into the data center. For example, when you load a website, your request and the server’s response are North-South traffic: the request enters from the “north” (user side) down to the “south” (the server), and the response heads back out.
- East-West traffic is the side-to-side flow within the data center. This is server-to-server or server-to-storage communication inside the facility. For instance, when one microservice on Server A calls another service on Server B, or when dozens of GPU servers exchange data during AI training, that’s East-West. It’s often called “lateral” traffic because it moves horizontally across the network fabric rather than going up and out of the data center.
In traditional network diagrams, you often see core routers and internet gateways at the top (north) and servers at the bottom (south), so it’s easy to visualize North-South traffic as vertical. East-West traffic, by contrast, flows through the network fabric connecting servers (often through top-of-rack switches to spine switches), which is like moving horizontally from one part of the data center to another.
The NVIDIA figure above illustrates this: notice how North-South flows have a natural choke point at the data center’s perimeter (where things like firewalls or load balancers might sit), whereas East-West flows can traverse many possible paths across a mesh of switches inside. This distributed nature of East-West traffic is why it’s both a blessing (lots of bandwidth potential) and a curse (harder to manage and secure).
The short version: North-South = client-to-server traffic; East-West = server-to-server traffic. North-South traffic is typically what leaves the building or enters it; East-West stays within the walls. Modern applications – especially cloud and AI – have massively increased the East-West portion of the mix. Instead of most bits heading in and out to users, most bits now bounce around between servers. That forces us to rethink how we design and equip our networks.
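If you prefer code to compass terms, here’s a toy classifier: a flow is East-West when both endpoints sit inside the data center’s address space, North-South otherwise. The prefixes are hypothetical placeholders, not a recommendation.

```python
# Toy flow classifier: East-West if both endpoints are internal, else North-South.
from ipaddress import ip_address, ip_network

INTERNAL_PREFIXES = [ip_network("10.0.0.0/8"), ip_network("192.168.0.0/16")]  # placeholders

def is_internal(ip: str) -> bool:
    addr = ip_address(ip)
    return any(addr in net for net in INTERNAL_PREFIXES)

def classify_flow(src: str, dst: str) -> str:
    return "East-West" if is_internal(src) and is_internal(dst) else "North-South"

print(classify_flow("10.1.2.3", "10.4.5.6"))     # East-West: server to server
print(classify_flow("10.1.2.3", "203.0.113.7"))  # North-South: server to external client
```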
What about Storage I/O?
Now, this is an excellent question: is storage I/O East-West or North-South? I had the opportunity to discuss this during an NVIDIA training.
Even if the storage arrays sit in the same rows as your GPU racks, the heavy read/write flows between GPU nodes and back-end storage are treated as North-South. Best practice (and NVIDIA’s guidance) is to wire each server into two logically separate fabrics: the first, a lossless, low-latency East-West compute fabric reserved for GPU collectives; the second, a BlueField-DPU-guarded North-South fabric for data ingestion, checkpointing, backups, and external client access. Offloading storage bursts to the North-South plane keeps GPU lanes clean, preserves training scalability, and makes security/QoS policies easier to enforce at the edge. It’s also how most storage vendors recommend wiring things, at least from the GPU workload’s point of view.
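As a rough illustration of what “two fabrics” looks like from the software side, here’s a hedged sketch that pins GPU collectives to the compute fabric. NCCL_IB_HCA and NCCL_SOCKET_IFNAME are standard NCCL environment variables, but the device and interface names below are placeholders you’d replace with your own.

```python
# Hedged sketch: keep GPU collectives on the East-West compute fabric while the
# other NIC(s) carry storage, checkpointing, and ingest (North-South).
import os
import torch.distributed as dist

# Pin NCCL to the HCAs cabled into the compute fabric (placeholder device names).
os.environ.setdefault("NCCL_IB_HCA", "mlx5_0,mlx5_1,mlx5_2,mlx5_3")
# Keep NCCL's bootstrap/socket traffic off the storage/ingest interface (placeholder name).
os.environ.setdefault("NCCL_SOCKET_IFNAME", "eth-compute")

dist.init_process_group(backend="nccl")
# ... training loop here: all-reduces ride the East-West fabric, while checkpoint
# writes to network storage leave via the North-South NIC(s).
```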
Traditional Data Centers vs. AI Era Architectures
It’s worth comparing how data center network patterns have changed from the past into today’s AI-driven era:
Traditional enterprise architecture
Think of a classic web application or a corporate IT environment. Traffic is often user-driven: a client makes a request that hits a front-end server, which maybe talks to a database, and results are sent back out. A lot of the traffic would go North-South, to and from users or between tiers of an app.
East-West traffic certainly existed (e.g., backups, replication, service-to-service calls in a microservices app), but it wasn’t the majority of the load. Network designs reflected this – typically a three-tier architecture (access, aggregation, core switches) with some oversubscription.
The “pipes” between racks or across the data center could be narrower because one rack’s servers didn’t all talk to another rack’s servers at full blast most of the time. If you were an IT admin a few years ago, you cared a lot about your North-South bandwidth to the internet, and about perimeter security. East-West was mostly an internal affair, easier to handle and often not closely monitored.
Modern AI / distributed computing architecture
Now consider what happens in an AI training cluster. You might have dozens of servers, each with multiple GPUs, collaborating on one task. A user might only see the final result (which is relatively small), so North-South for the actual “answer” is minimal, but under the hood those servers are exchanging gigabytes per second among themselves (East-West) to coordinate the computation. The internal traffic dwarfs anything going out or coming in.
Essentially, the data center itself acts like one giant computer for AI, and the network is the bus that connects its components. Oversubscription becomes a big problem when every server suddenly needs to talk to every other at high speed. Modern AI data centers thus favor a Clos (leaf-spine) topology that can provide near non-blocking bandwidth so any server can talk to any other with minimal contention.

Operators are building fabrics with hundreds or thousands of 100-Gbps-plus links to make sure GPUs and distributed applications don’t starve. In reference architectures like DGX SuperPOD, a two-tier leaf-spine network using 51.2-Tbps switches can connect up to 16,000 nodes with full bisection bandwidth – an architecture tuned for East-West throughput.
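For intuition on what “near non-blocking” means in a leaf-spine design, here’s the basic arithmetic: compare each leaf’s downlink capacity (to servers) with its uplink capacity (to spines). The numbers below are illustrative, not a particular product configuration.

```python
# Leaf-spine oversubscription: downlink capacity vs. uplink capacity per leaf.
# A 1:1 ratio means any server can talk to any other at full speed (non-blocking).
def oversubscription(servers_per_leaf, server_gbps, uplinks_per_leaf, uplink_gbps):
    downlink = servers_per_leaf * server_gbps
    uplink = uplinks_per_leaf * uplink_gbps
    return downlink / uplink

# Illustrative: 32 servers at 400 Gb/s into each leaf, 16 x 800 Gb/s uplinks to the spines.
ratio = oversubscription(32, 400, 16, 800)
print(f"Oversubscription: {ratio:.1f}:1")   # 1.0:1 -> full bisection bandwidth East-West
```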
Another way to look at it…
Traditional data centers scaled out services by replicating them (lots of North-South from clients to each server). In AI and cloud, we often scale by distributing one workload across many machines, which means those machines must talk amongst themselves constantly (East-West). The network has become like the nervous system of an AI supercomputer – absolutely vital.
In practice, many AI-oriented data centers end up with two networks: one for cloud control and user access (North-South), and a second internal compute fabric for GPUs/CPUs to synchronize and move data around (East-West). Try to run that compute fabric on “regular” Ethernet and you quickly find the limits; new ideas and new silicon are required.
New Networking Tech for an East-West World
So, how are engineers responding to these new demands? The good news: we’re in a golden age of data center networking. A whole class of technologies has emerged to boost East-West capacity, reduce latency, and offload network tasks. Here are the key players and where they shine:
SmartNICs and DPUs (Data Processing Units): Supercharged NICs with their own compute (CPU cores, ASICs) to offload packet processing, encryption, virtual switching, NVMe-oF, and more. In practice, DPUs benefit both North-South and East-West by accelerating data movement, but they’re often steered toward infrastructure and security roles – essentially the North-South duties in an AI data center. Many AI servers include a BlueField-class DPU to handle ingress/egress, encryption, and policy – freeing CPUs/GPUs to chew on AI data and giving you clean isolation at the NIC.
NVIDIA’s SuperNIC (for East-West): A purpose-built adapter for the GPU-to-GPU data tsunami in AI clusters. Think of it as a slimmed-down, high-octane cousin of the DPU: it doesn’t try to offload every cloud function; it focuses on moving data between AI nodes as fast as possible. A BlueField-3–based SuperNIC can drive 400 Gb/s with RoCE, add hardware packet reordering, and apply congestion control tuned for AI. Paired with Spectrum-class switches, it creates an Ethernet fabric that behaves a lot like InfiniBand – lossless and low-latency – for East-West traffic. The design goal is simple: one SuperNIC per GPU, scale linearly, and keep the network out of the way.
InfiniBand and high-speed Ethernet (RDMA): HPC has long used InfiniBand for ultra-low latency and RDMA. Many hyperscale AI clusters still do. Others want one fabric, so they lean on Ethernet with RDMA (RoCE) and smarter congestion control. Either way, the common thread is RDMA – getting data into a peer’s memory with minimal CPU in the path. If you’re building an AI cluster today, you’ll choose between proven InfiniBand (HCAs + Quantum-class switches) or Ethernet with RoCE and specialized NICs/DPUs. The target is the same: maximize East-West performance.
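If you want to check which transport your collectives actually ended up on, here’s a hedged sketch: NCCL_DEBUG and NCCL_IB_DISABLE are standard NCCL environment variables, and toggling the latter lets you compare step times with and without RDMA in the path.

```python
# Hedged sketch: verify whether collectives use RDMA (InfiniBand/RoCE) or plain TCP.
import os
import torch.distributed as dist

os.environ.setdefault("NCCL_DEBUG", "INFO")   # NCCL logs the transport it selects at init
# os.environ["NCCL_IB_DISABLE"] = "1"         # uncomment to force TCP sockets and compare

dist.init_process_group(backend="nccl")
# Run your usual all-reduce benchmark here; the gap between the two runs is roughly
# the cost of putting the CPU (and the kernel network stack) back in the data path.
```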
Inside the Server: NVLink and NVSwitch: Even within a single server, data movement is a first-class concern. NVLink and NVSwitch create an internal East-West fabric so eight GPUs can exchange data at terabytes per second, with aggregate bandwidth around 14.4 TB/s for the B300 in systems like DGX. That ensures intra-node chatter doesn’t spill onto the inter-node network and become the bottleneck.
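And inside the box, a minimal sketch to confirm that GPUs can reach each other directly (over NVLink/NVSwitch or PCIe peer-to-peer) using only standard PyTorch CUDA calls; it assumes a multi-GPU node.

```python
# Minimal sketch: check intra-node peer access and do a direct GPU-to-GPU copy.
import torch

n = torch.cuda.device_count()
for i in range(n):
    for j in range(n):
        if i != j and torch.cuda.can_device_access_peer(i, j):
            print(f"GPU {i} -> GPU {j}: direct peer access available")

# A device-to-device copy stays on the intra-node fabric and never touches
# the inter-node network.
if n >= 2:
    src = torch.randn(1 << 20, device="cuda:0")
    dst = src.to("cuda:1", non_blocking=True)   # peer-to-peer copy when supported
    torch.cuda.synchronize()
```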
To tie it together, here’s a quick summary of what’s used where:

Conclusion: The Network is the Computer (Again)
It’s clear that in the AI era, the network has moved from the sidelines to center stage. We used to say “the network is the computer,” and that slogan feels literal in modern AI data centers; a huge portion of “compute” is effectively how fast we can shuffle data between compute elements. East-West traffic has become the lifeblood of GPU-accelerated data centers, soon to be AI factories. If the fabric can’t keep up, your GPUs will sit idle.
The silver lining: this challenge lit a fire under networking innovation. Ten years ago, who imagined 400-Gb/s per accelerator or AI-specific congestion control? Now we have DPUs that act like mini-computers and SuperNICs that make Ethernet behave like a high-performance fabric. Switch ASICs have blown past 50 Tb/s, and GPU clusters feel like one giant box. The once-humble NIC is now strategic.
Networks matter more than ever. If you’re building modern apps—AI training, real-time analytics, microservices at scale—you can’t treat the network like plumbing. It shapes your architecture and your performance. Thankfully, the toolbox is deep: DPUs to offload and secure North-South, SuperNICs and RDMA fabrics to unleash East-West, and NVLink/NVSwitch to keep intra-node chatter off the wire.
In the end, the line between “computing” and “networking” is blurring. To accelerate AI, you must accelerate the connections between compute nodes, not just the nodes themselves (as was the focus in the virtualization era). East-West is no longer a sideshow; it’s the main event. Ask any team “what does the network look like?” before you spin up that next training run – because the fabric might be the difference between merely good and world-class. In this era, your East-West highway has to be engineered as carefully as the engines (GPUs/CPUs) it connects.
Takeaway: Get your East-West game right, and you’re halfway to AI success. After all, an AI data center is only as strong as its weakest link—literally.