NET // FIELD MANUAL  ·  kartikeytripathi.in ↗
BY KARTIKEY TRIPATHI  ·  ROLE AWS CONTAINERS ENGINEER  ·  FOCUS NETWORKING FOUNDATIONS  ·  EDITION 2026.05

Networking/
a field manual

Five modules, one mantra. Move from packets on a wire to VPCs in the cloud, with a debugging arsenal in between. Built by a Cloud Engineer at AWS, for engineers who already read kubernetes/kubernetes for fun.

01
// fundamentals

How the internet actually works.

Strip away the abstractions. An IP address is a mailing address, DNS is the phonebook, HTTP is the language two machines agree to speak, and a request is what travels through all of it. Build this mental model first — everything else hangs off it.

IP v4 / v6  ·  DNS A/AAAA/CNAME  ·  HTTP 1.1 / 2 / 3  ·  Client ⇄ Server  ·  URL anatomy  ·  Headers & status codes
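To make "URL anatomy" concrete, here is a minimal sketch using Python's standard library. The URL is a made-up example and is never fetched; the point is only which piece feeds which layer: the host goes to DNS, the port to TCP, and the path and query to HTTP.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL chosen for illustration; nothing is sent over the network.
url = "https://api.example.com:8443/v1/health?verbose=1&region=us-east-1#top"

parts = urlsplit(url)
anatomy = {
    "scheme": parts.scheme,          # which protocol to speak (and whether TLS)
    "host": parts.hostname,          # what DNS has to resolve
    "port": parts.port,              # which TCP port to connect to
    "path": parts.path,              # what the HTTP request asks for
    "query": parse_qs(parts.query),  # key/value parameters sent with the request
    "fragment": parts.fragment,      # stays in the client, never sent to the server
}

for name, value in anatomy.items():
    print(f"{name:9} {value}")
```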
02
// core concepts

Ports, protocols, packets & routing.

The middle layer. TCP guarantees order and delivery; UDP trades guarantees for speed. DNS resolution is recursive and cached at every hop. Routing is just “next hop?” asked over and over.
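"Next hop?" is answered by longest-prefix match: of all routes that contain the destination, the most specific one wins. A minimal sketch, assuming a hypothetical three-entry route table (the CIDRs and hop names are illustrative, not from any real system):

```python
import ipaddress

# Hypothetical route table: (destination CIDR, next hop).
ROUTES = [
    ("0.0.0.0/0", "igw"),
    ("10.0.0.0/16", "local"),
    ("10.0.2.0/24", "peering"),
]

def next_hop(dst, routes=ROUTES):
    """Return the next hop of the most specific route containing dst."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(cidr), hop)
        for cidr, hop in routes
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route to host
    # Longest prefix wins: /24 beats /16 beats /0.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop

print(next_hop("10.0.2.7"))   # matches all three routes, /24 is most specific
print(next_hop("10.0.9.1"))   # matches /16 and /0
print(next_hop("8.8.8.8"))    # only the default route matches
```

Every router on the path runs some version of this lookup; `ip route get <ip>` shows the kernel's answer on a real host.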

Layer | What it does | Owns in packet | You debug with
L2 / Link | Moves frames across a physical/virtual network | MAC address | arp, ip link
L3 / Network | Routes packets between networks | IP address, TTL | ip route, traceroute, ping
L4 / Transport | End-to-end conversations (TCP/UDP) | Port, sequence, flags | ss, netstat, tcpdump
L7 / Application | HTTP, DNS, gRPC, the stuff humans care about | Headers, methods, body | curl, dig, nslookup
Aspect | TCP | UDP
Delivery | Guaranteed, ordered, retried | Fire-and-forget, no order
Setup | 3-way handshake (SYN → SYN-ACK → ACK) | None, just send
Overhead | Higher (state, ACKs, retransmits) | Lower (almost none)
Use when | HTTP, SSH, databases: correctness matters | DNS query, VoIP, gaming, metrics: speed matters
Failure mode | Slowness, retransmits, RST | Silent loss, you don’t know
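The failure-mode row is easy to see on loopback. In this sketch both probes target a port where nothing is listening: TCP finds out immediately (the kernel answers the SYN with a RST), while the UDP send "succeeds" with no hint that the datagram went nowhere. The helper names are made up for illustration, and the behaviour described assumes a typical Linux loopback interface.

```python
import socket

def closed_port(kind):
    """Bind to an ephemeral loopback port, then release it so nothing listens there."""
    with socket.socket(socket.AF_INET, kind) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def try_tcp(port):
    """TCP must complete a handshake, so a closed port is reported right away."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(2)
        try:
            s.connect(("127.0.0.1", port))
            return "connected"
        except ConnectionRefusedError:
            return "refused"

def try_udp(port):
    """UDP has no handshake: the send returns as soon as the datagram is queued."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        sent = s.sendto(b"ping", ("127.0.0.1", port))
        return f"sent {sent} bytes"

print("tcp:", try_tcp(closed_port(socket.SOCK_STREAM)))
print("udp:", try_udp(closed_port(socket.SOCK_DGRAM)))
```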
▶ MEMORY HOOK Ports under 1024 are privileged (on Linux, binding needs root or CAP_NET_BIND_SERVICE). 22 ssh, 53 dns, 80 http, 443 https, 3306 mysql, 5432 postgres, 6379 redis, 6443 kube-apiserver, 2379-2380 etcd.
03
// devops-focused networking

The pieces in front of your app.

In production almost nothing talks to your container directly. Traffic lands on a load balancer, may pass through a reverse proxy, gets translated by NAT, and only then reaches your service. Each of these is a place a request can go wrong.

Component | What it does | Lives at layer | Real-world example
Load Balancer (L4) | Spreads TCP/UDP connections across backends; doesn’t see HTTP | L4 | AWS NLB, HAProxy (TCP mode)
Load Balancer (L7) | Reads HTTP: routes by host/path, terminates TLS, rewrites headers | L7 | AWS ALB, NGINX, Envoy, Traefik
Reverse Proxy | Sits in front of an app, can cache, compress, auth, route | L7 | NGINX, Caddy, Envoy
NAT | Rewrites src/dst IP & port so private nets can reach the internet | L3/L4 | AWS NAT Gateway, your home router
Ingress Controller | K8s pattern that programs an L7 LB from Ingress / Gateway resources | L7 | AWS LBC, NGINX Ingress, Istio
Service Mesh | Sidecar proxies handle mTLS, retries, traffic-split between services | L4 – L7 | Istio, Linkerd, App Mesh
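The NAT row hides a small state machine. Below is a toy source-NAT table, a sketch rather than how any real gateway is implemented: the class name, the port allocation scheme, and the addresses are all assumptions for illustration. It shows why return traffic finds its way back, and why unsolicited inbound traffic has nowhere to go.

```python
import itertools

class SourceNat:
    """Toy source NAT: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._out = {}   # (private_ip, private_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port):
        """Rewrite the source of an outgoing packet, creating a mapping if needed."""
        key = (src_ip, src_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return self.public_ip, self._out[key]

    def inbound(self, dst_port):
        """Map a reply back to the private host, or None if nobody asked for it."""
        return self._back.get(dst_port)

nat = SourceNat("203.0.113.10")
print(nat.outbound("10.0.2.15", 40000))  # first flow gets the first public port
print(nat.outbound("10.0.2.16", 40000))  # same private port, different host
print(nat.inbound(1024))                 # reply is translated back
print(nat.inbound(9999))                 # no mapping: unsolicited traffic is dropped
```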
▶ CONTAINERS TIE-IN In EKS, an Ingress with target-type: ip programs the ALB to send traffic directly to pod IPs via VPC CNI — skipping the kube-proxy/Service hop. With target-type: instance the ALB hits a NodePort and lets kube-proxy do the second hop. Knowing which mode you’re in changes how you debug latency.
04
// debugging arsenal — most important

If you can trace traffic, you can fix it.

Reading about networking gets you understanding. Running these commands against a broken system gets you a paycheck. Each tool below comes with its purpose, a real example, and the thing that usually trips people up.

ping
Is the host reachable? Sends ICMP echo. The fastest way to confirm L3 connectivity — or notice that ICMP is blocked.
ping -c 4 8.8.8.8
ping -c 4 kubernetes.default.svc.cluster.local
If ping fails but curl works → ICMP is blocked, app is fine.
curl
Talk HTTP / inspect the wire. The single most useful tool. Headers, timing, TLS, redirects — all of it.
curl -v https://api.example.com/health
curl -I https://example.com # headers only
curl -w "%{time_total}\n" -o /dev/null -s … # just timing
Add --resolve host:443:1.2.3.4 to bypass DNS — isolates whether DNS or the backend is broken.
traceroute
Which hop is dropping packets? Sends packets with increasing TTL so each router on the path is forced to identify itself.
traceroute -n example.com
mtr -rwzbc 50 example.com # better: per-hop loss and latency over 50 probes
* * * isn’t always failure — many hops just refuse to reply to ICMP/UDP probes.
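The TTL trick is easier to trust once you have simulated it. This sketch uses a made-up path (the hop names are hypothetical) and models only the TTL logic: each router decrements the TTL, and whichever router takes it to zero is the one that answers.

```python
# Hypothetical path: three routers, then the destination host.
PATH = ["home-router", "isp-edge", "transit-1", "example.com"]

def send_probe(ttl, path=PATH):
    """Return (responder, status) for one probe sent with the given TTL."""
    *routers, destination = path
    for router in routers:
        ttl -= 1                      # every router decrements TTL before forwarding
        if ttl == 0:
            return router, "time-exceeded"
    return destination, "reached"

def traceroute(path=PATH, max_ttl=30):
    """Send probes with TTL 1, 2, 3... until the destination itself answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, status = send_probe(ttl, path)
        hops.append((ttl, responder))
        if status == "reached":
            break
    return hops

for ttl, responder in traceroute():
    print(ttl, responder)
```

A real hop that declines to send the "time exceeded" reply shows up as `* * *`, even though it forwarded the probe just fine.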
ss / netstat
What sockets are open on this host? Prefer ss: faster, modern, and the common flags (-t, -l, -n, -p) carry over. Fall back to netstat on legacy boxes.
ss -tlnp # all listening TCP + pid
ss -tnp state established # who am I talking to?
ss -s # summary counts (great for SYN floods)
A pod that “won’t serve traffic” often isn’t actually listening. ss -tlnp ends the argument.
dig / nslookup
What does DNS actually return? dig is the surgical tool; nslookup is what’s already installed everywhere.
dig +short A example.com
dig @8.8.8.8 example.com # ask a specific resolver
dig +trace example.com # follow the delegation chain from the root servers
In EKS: dig svc-name.namespace.svc.cluster.local from inside a pod isolates CoreDNS issues.
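One caveat worth a sketch: dig sends DNS queries itself, while most applications resolve names through the system resolver, which also consults /etc/hosts and whatever else the host is configured to use. When dig and the app disagree, comparing the two views is a quick check. The `resolve` helper below is a hypothetical name wrapping Python's standard `socket.getaddrinfo`.

```python
import socket

def resolve(name, family=socket.AF_INET):
    """Resolve a name the way most applications do: via the system resolver."""
    infos = socket.getaddrinfo(name, None, family=family, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

# "localhost" usually comes from /etc/hosts, so this works without any DNS server.
print(resolve("localhost"))
```

Run `resolve("svc-name.namespace.svc.cluster.local")` from inside a pod and compare it with the dig output: a mismatch suggests the resolver configuration, not CoreDNS itself.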
tcpdump
What’s actually on the wire? The court of last resort. Capture, then read in Wireshark.
tcpdump -i any -n port 443 -w cap.pcap
tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-syn) != 0'
In EKS Fargate you can’t run tcpdump on the host — use a sidecar or VPC Traffic Mirroring.
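The `tcp[tcpflags] & (tcp-syn) != 0` filter is a plain bitmask test on the TCP flags byte. A minimal sketch of the same arithmetic, using the standard flag bit values; the function names are made up for illustration:

```python
# Bit values of the TCP flags byte.
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def decode_flags(flags_byte):
    """List the flag names set in a TCP flags byte."""
    return [name for name, bit in TCP_FLAGS.items() if flags_byte & bit]

def matches_syn_filter(flags_byte):
    """Same test as: tcp[tcpflags] & (tcp-syn) != 0"""
    return (flags_byte & TCP_FLAGS["SYN"]) != 0

def is_initial_syn(flags_byte):
    """Same test as: tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn"""
    mask = TCP_FLAGS["SYN"] | TCP_FLAGS["ACK"]
    return (flags_byte & mask) == TCP_FLAGS["SYN"]

for flags in (0x02, 0x12, 0x10):
    print(hex(flags), decode_flags(flags), matches_syn_filter(flags), is_initial_syn(flags))
```

Note that the filter from the example matches SYN-ACK replies as well as initial SYNs. If you only want connection attempts, the stricter mask in `is_initial_syn` is the one to translate back into a capture filter.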
ip
The modern replacement for ifconfig/route. Inspect interfaces, addresses, routes, ARP — all in one tool.
ip addr
ip route get 8.8.8.8 # which interface & gateway is used?
ip -s link # rx/tx errors, drops
ip route get <ip> answers “why does my traffic leave the wrong interface?” in one line.
nc (netcat)
Test a port without an app. Open raw TCP/UDP. The “is the security group letting me through” tool.
nc -zv db.internal 5432 # connect, report, exit
nc -lvnp 8080 # listener for testing
Pair with ss -tlnp on the other side — you’ve just proved L3+L4 from both ends.
▶ ESCALATION PATTERN When a customer reports “it’s slow / not working”, walk the stack in order: ping → dig → nc -zv → curl -v → traceroute/mtr → tcpdump. Each step rules out a layer.
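The same "rule out one layer at a time" idea can be scripted. The sketch below covers only two rungs of the ladder, name resolution (the dig step) and a TCP connect (the nc -zv step), and reports the first one that fails. `diagnose` is a hypothetical helper, not a replacement for the tools above.

```python
import socket

def diagnose(host, port, timeout=2.0):
    """Return 'ok', or the first failing layer as 'dns: ...' or 'tcp: ...'."""
    # Step 1: can the name be resolved at all?
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns: {exc}"

    # Step 2: does any resolved address accept a TCP connection?
    last_error = None
    for family, kind, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, kind, proto) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
            return "ok"
        except OSError as exc:  # refused, timed out, unreachable...
            last_error = exc
    return f"tcp: {last_error}"
```

The error text matters as much as the layer: "connection refused" means the host answered and nothing is listening, while a timeout points at a security group, NACL, or route silently dropping packets.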
05
// cloud networking — must for devops

How traffic flows inside the cloud.

A VPC is just a software-defined data center. Once you’ve internalized the VPC → subnet → route table → SG chain, every cloud network feels the same. Without it, every cloud network feels like a riddle.

// anatomy of a vpc
INTERNET  0.0.0.0/0
  IGW
  VPC  10.0.0.0/16
    PUBLIC SUBNET  10.0.1.0/24
      ALB · NAT GW
      route: 0.0.0.0/0 → IGW
    PRIVATE SUBNET  10.0.2.0/24
      EKS node · RDS
      route: 0.0.0.0/0 → NAT GW
    SECURITY GROUP (public)
      ingress tcp 443 from 0.0.0.0/0
      ingress tcp 80  from 0.0.0.0/0
      egress  all
      stateful · default deny
    SECURITY GROUP (private)
      ingress tcp 5432 from sg-public
      ingress tcp 22   from 10.0.0.0/16
      egress  all
      references by sg-id > by CIDR
↑ ingress via IGW → ALB → pod  |  ↓ egress via NAT GW for outbound calls
Concept | Mental model | Common gotcha
VPC | A virtual data-center with its own private IP range (CIDR). | CIDR overlap kills VPC peering and TGW attachments. Plan early.
Subnet | A slice of the VPC, tied to one AZ. “Public” just means the route table points to an IGW. | A subnet isn’t public because of its name; it’s public because of its route table.
Route Table | Per-subnet rule: “for destination X, send to gateway Y.” | Missing 0.0.0.0/0 → nat-gw in a private subnet = mysterious egress timeouts.
Security Group | Stateful firewall at the ENI level. Default deny inbound, default allow outbound. | SGs are allow-only. There’s no deny rule. Use NACLs for explicit denies.
NACL | Stateless ACL at the subnet boundary. Numbered rules, allow + deny. | Stateless: you must allow ephemeral return ports (1024-65535) on outbound responses.
IGW vs NAT GW | IGW = two-way internet. NAT GW = outbound-only for private subnets. | NAT GW is per-AZ: one per AZ if you want HA and no cross-AZ data charges.
VPC Endpoint | Private tunnel from your VPC to an AWS service (S3, ECR, STS) without internet. | Without endpoints, every pull from ECR goes out the NAT GW. Look at your bill.
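"Plan early" can be checked in a few lines. This sketch uses the CIDRs from the anatomy above plus two hypothetical peer VPC ranges (assumptions for illustration) to show what "overlap" means in practice.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
public = ipaddress.ip_network("10.0.1.0/24")
private = ipaddress.ip_network("10.0.2.0/24")

# Hypothetical peer VPCs you might want to connect to later.
peer_bad = ipaddress.ip_network("10.0.128.0/17")   # sits inside 10.0.0.0/16
peer_good = ipaddress.ip_network("10.1.0.0/16")

def can_peer(a, b):
    """Peering needs unambiguous routes, so the two ranges must not overlap."""
    return not a.overlaps(b)

print(public.subnet_of(vpc), private.subnet_of(vpc))  # both subnets fit in the VPC
print(public.overlaps(private))                        # the subnets do not collide
print(can_peer(vpc, peer_bad), can_peer(vpc, peer_good))
print(vpc.num_addresses, public.num_addresses)         # raw sizes of a /16 and a /24
```

The raw sizes are upper bounds: cloud providers reserve a few addresses in every subnet, so usable capacity is slightly lower.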
▶ EKS LENS VPC CNI assigns each pod a real VPC IP from the secondary IPs of the node’s ENIs. That’s why pod count per node is bounded by ENI/IP limits, and why Security Groups for Pods can work at all: to the VPC a pod is a first-class address, and with that feature enabled it gets its own branch ENI.
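The ENI/IP bound is simple arithmetic. The sketch below assumes the commonly documented formula for secondary-IP mode (no prefix delegation): each ENI loses one address to its own primary IP, and two extra pods are counted because host-network pods do not consume a VPC IP. The m5.large and t3.medium limits used are assumptions taken from AWS's published per-instance-type tables; check them for your instance type.

```python
def max_pods(enis, ips_per_eni):
    """Upper bound on pods per node with VPC CNI in secondary-IP mode.

    enis:        maximum ENIs the instance type supports
    ips_per_eni: IPv4 addresses per ENI (including the ENI's primary IP)
    """
    return enis * (ips_per_eni - 1) + 2

# Assumed limits: m5.large = 3 ENIs x 10 IPs, t3.medium = 3 ENIs x 6 IPs.
print("m5.large :", max_pods(3, 10))
print("t3.medium:", max_pods(3, 6))
```

When a node refuses to schedule more pods with plenty of CPU and memory free, this number is one of the first things to check.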
// final rule
Don’t just read networking. Break things. Trace traffic. Fix issues.
∎  THE LAB IS THE CURRICULUM  ∎