IP/Network Layer

Basic Understanding of IP#

The main function of the network layer is to enable host-to-host communication, that is, to deliver packets from a source host to a destination host that may be on a different network.

What is the relationship between the network layer and the data link layer?#

MAC addressing handles communication between two directly connected devices, while IP is responsible for transmission between two networks that are "not directly connected."

As a data packet travels across the network, its source and destination IP addresses stay the same (provided that NAT is not used), while its source and destination MAC addresses change at every hop.

Basic Knowledge of IP Addresses#

In TCP/IP network communication, to ensure normal communication, each device needs to be configured with the correct IP address; otherwise, normal communication cannot be achieved.
An IP address (IPv4 address) is a 32-bit unsigned integer, and computers process IP addresses in binary.
For convenience, humans use a dotted-decimal notation, which divides the 32-bit IP address into 4 groups of 8 bits each, separated by dots, and converts each group into decimal.
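
The conversion described above can be sketched in a few lines of Python. This is only an illustration; the function names `to_dotted_decimal` and `from_dotted_decimal` are made up for this example.

```python
def to_dotted_decimal(value: int) -> str:
    """Convert a 32-bit unsigned integer into dotted-decimal notation."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("an IPv4 address must fit in 32 bits")
    # Take the four 8-bit groups, most significant group first.
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def from_dotted_decimal(text: str) -> int:
    """Convert dotted-decimal notation back into a 32-bit integer."""
    parts = [int(part) for part in text.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError(f"not a valid IPv4 address: {text!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # shift in one 8-bit group at a time
    return value


print(to_dotted_decimal(0b10101100_00010100_00000001_00000001))  # 172.20.1.1
```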

Classification of IP Addresses#

IP addresses are classified into 5 types: Class A, Class B, Class C, Class D, and Class E.

What are Class A, B, and C addresses?

For Class A, B, and C, they are mainly divided into two parts: the network number and the host number.

How is the maximum number of hosts for Class A, B, and C addresses calculated?

The maximum number of hosts depends on the number of bits in the host number. For example, for Class C addresses, the host number occupies 8 bits, so the maximum number of hosts for Class C addresses is: $2^8 - 2 = 254$
Why subtract 2?
Because there are two special IP addresses in the IP address space: one where the host number is all 1s and one where it is all 0s.

The host number all being 1s designates all hosts in a specific network, used for broadcasting.
The host number all being 0s designates a specific network.

Therefore, during allocation, these two cases should be excluded.
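
As a quick check of the arithmetic, here is a minimal sketch. The helper name `max_hosts` is hypothetical, and the host-number widths of 24, 16, and 8 bits for Classes A, B, and C are the usual classful values, assumed here rather than taken from the text above.

```python
def max_hosts(host_bits: int) -> int:
    """Usable host addresses when the host number has `host_bits` bits."""
    # Exclude the all-0s host number (the network itself) and the
    # all-1s host number (the broadcast address).
    return 2 ** host_bits - 2


# Assumed classful host-number widths: A = 24 bits, B = 16 bits, C = 8 bits.
for name, host_bits in (("A", 24), ("B", 16), ("C", 8)):
    print(f"Class {name}: {max_hosts(host_bits)} hosts")
```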

What is a broadcast address used for?

A broadcast address is used to send a data packet to all hosts connected to the same link at once.
When the host number is all 1s, it indicates the broadcast address for that network. For example, the binary representation of 172.20.0.0/16 is:
10101100.00010100.00000000.00000000
Changing all the host part of this address to 1s forms the broadcast address:
10101100.00010100.11111111.11111111
In decimal, this address is represented as 172.20.255.255.
Broadcast addresses can be divided into local broadcasts and directed broadcasts.

A local broadcast occurs within the same network. For example, if the network address is 192.168.0.0/24, the broadcast address is 192.168.0.255. Routers block IP packets sent to this address, so they never reach links outside 192.168.0.0/24.
A directed broadcast is sent to the broadcast address of a different network. For example, a host on 192.168.0.0/24 sends an IP packet to the destination address 192.168.1.255. The router that receives this packet forwards it to 192.168.1.0/24, so that all hosts from 192.168.1.1 to 192.168.1.254 receive it. Because directed broadcasts raise security concerns, routers are generally configured not to forward them.
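
Setting every host bit to 1 can be reproduced with Python's standard `ipaddress` module. The following is a small sketch; `broadcast_of` is a name chosen for this example.

```python
import ipaddress


def broadcast_of(cidr: str) -> str:
    """Return the broadcast address of a network given in CIDR form."""
    network = ipaddress.ip_network(cidr)
    # The host mask has 1s exactly in the host bits, so OR-ing it with
    # the network address sets every host bit to 1.
    value = int(network.network_address) | int(network.hostmask)
    return str(ipaddress.IPv4Address(value))


print(broadcast_of("172.20.0.0/16"))   # 172.20.255.255
print(broadcast_of("192.168.0.0/24"))  # 192.168.0.255
```

The module also exposes the same value directly as `network.broadcast_address`; the manual OR is shown only to mirror the binary explanation.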

What are Class D and E addresses?

Class D and E addresses do not have host numbers, so they cannot be used for host IPs. Class D is often used for multicast, while Class E is a reserved classification that is not currently in use.

What is a multicast address used for?

Multicast is used to send packets to all hosts within a specific group.

Multicast uses Class D addresses, where the first four bits are 1110, indicating it is a multicast address, and the remaining 28 bits represent the multicast group number.
The range for multicast is from 224.0.0.0 to 239.255.255.255, which is divided into three categories:

224.0.0.0 to 224.0.0.255 are reserved multicast addresses, which can only be used within a local area network, and routers will not forward them.
224.0.1.0 to 238.255.255.255 are user-available multicast addresses that can be used on the Internet.
239.0.0.0 to 239.255.255.255 are locally managed multicast addresses, which can be used internally within a network and are only valid within a specific local range.
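
The three ranges can be expressed as a small lookup. This is a sketch under the ranges listed above; the function name and the category labels are illustrative only.

```python
import ipaddress


def multicast_category(address: str) -> str:
    """Classify an IPv4 address according to the multicast ranges above."""
    ip = ipaddress.IPv4Address(address)
    # A multicast (Class D) address starts with the four bits 1110.
    if int(ip) >> 28 != 0b1110:
        return "not multicast"
    if ip <= ipaddress.IPv4Address("224.0.0.255"):
        return "reserved"
    if ip <= ipaddress.IPv4Address("238.255.255.255"):
        return "user-available"
    return "locally managed"


for sample in ("224.0.0.5", "232.1.1.1", "239.1.2.3", "192.168.0.1"):
    print(sample, "->", multicast_category(sample))
```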

Advantages of IP classification

Whether on a router or a host, classification makes an IP address quick to parse, because only the leading bits need to be inspected. For example, if the first bit is 0, the address is Class A, so the boundary between the network number and the host number is known immediately.
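
A minimal sketch of this leading-bit check follows. The leading bits 0 for Class A and 1110 for Class D come from the text; the patterns 10, 110, and 1111 for Classes B, C, and E are the standard classful values and are assumed here.

```python
def address_class(address: str) -> str:
    """Determine the class of an IPv4 address from its leading bits."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError(f"invalid first octet in {address!r}")
    if first_octet >> 7 == 0b0:      # 0xxxxxxx
        return "A"
    if first_octet >> 6 == 0b10:     # 10xxxxxx
        return "B"
    if first_octet >> 5 == 0b110:    # 110xxxxx
        return "C"
    if first_octet >> 4 == 0b1110:   # 1110xxxx
        return "D"
    return "E"                       # 1111xxxx


for sample in ("10.1.1.30", "172.20.0.1", "192.168.1.1", "224.0.0.1"):
    print(sample, "-> Class", address_class(sample))
```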

Disadvantages of IP classification

Disadvantage 1:
There is no address hierarchy within the same network. For example, a company may use a Class B address but may need to classify addresses based on production, testing, and development environments. However, this IP classification does not provide the functionality for hierarchical address classification, leading to a lack of address flexibility.
Disadvantage 2:
Classes A, B, and C face an awkward situation where they do not match well with real-world networks.

Class C addresses can only accommodate a maximum of 254 hosts, which is often insufficient for even a small internet café.
Class B addresses can accommodate too many hosts, with over 60,000 machines under one network, which is generally beyond the scale of most enterprises, leading to wasted addresses.

Both of these disadvantages can be addressed by CIDR (Classless Inter-Domain Routing).

Classless Addressing CIDR#

Due to many shortcomings in IP classification, a classless addressing scheme was proposed later, known as CIDR.
This method eliminates the concept of classified addresses, dividing the 32-bit IP address into two parts: the network number and the host number.

How are the network number and host number divided?

The representation is a.b.c.d/x, where /x indicates that the first x bits belong to the network number, and x can range from 0 to 32, making IP addresses more flexible.

Another way to divide the network number and host number is through the subnet mask, which means masking the host number, leaving the network number.
By performing a bitwise AND operation between the subnet mask and the IP address, the network number can be obtained.
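
Both representations can be tried out with a short sketch. The function names and the sample address 10.100.122.2 are hypothetical.

```python
import ipaddress


def prefix_to_mask(prefix_len: int) -> str:
    """Turn the /x of a.b.c.d/x into a dotted-decimal subnet mask."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    # x leading 1 bits followed by (32 - x) 0 bits.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask))


def network_number(address: str, mask: str) -> str:
    """Bitwise AND of the IP address and the subnet mask."""
    ip_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_bits & mask_bits))


print(prefix_to_mask(24))                               # 255.255.255.0
print(network_number("10.100.122.2", "255.255.255.0"))  # 10.100.122.0
```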

Why separate the network number and host number?

When two computers need to communicate, the first step is to determine whether they are in the same broadcast domain, i.e., whether their network addresses are the same. If the network addresses are the same, the recipient is on the same network, and the data packet can be sent directly to the target host.
In the router addressing process, this is how the corresponding network number is found, allowing the data packet to be forwarded to the corresponding network.

How to perform subnetting?

As mentioned above, we can use the subnet mask to divide the network number and host number. In fact, the subnet mask also serves the purpose of subnetting.
Subnetting essentially divides the host address into two parts: the subnet network address and the subnet host address. The format is as follows:

IP address without subnetting: network address + host address
IP address after subnetting: network address + (subnet network address + subnet host address)

Suppose we subnet a Class C network with the network address 192.168.1.0, using the subnet mask 255.255.255.192.
In a Class C address, the first 24 bits are the network number, and the last 8 bits are the host number. According to the subnet mask, we can borrow 2 bits from the 8-bit host number to serve as the subnet number.
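
To see what borrowing 2 bits produces, the example can be worked through with the standard `ipaddress` module. This is a sketch; `split_by_mask` is a made-up helper, and it assumes the mask is contiguous (all 1 bits first).

```python
import ipaddress


def split_by_mask(cidr: str, new_mask: str):
    """Split a network into the subnets implied by a longer subnet mask."""
    network = ipaddress.ip_network(cidr)
    # Counting the 1 bits of a contiguous mask gives the new prefix length.
    new_prefix = bin(int(ipaddress.IPv4Address(new_mask))).count("1")
    return list(network.subnets(new_prefix=new_prefix))


for subnet in split_by_mask("192.168.1.0/24", "255.255.255.192"):
    first_host = subnet.network_address + 1
    last_host = subnet.broadcast_address - 1
    print(subnet, "hosts", first_host, "-", last_host,
          "broadcast", subnet.broadcast_address)
```

Two borrowed bits yield four subnets, and each subnet keeps 6 host bits, so it can hold $2^6 - 2 = 62$ hosts.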

Public IP Addresses and Private IP Addresses#

In Class A, B, and C addresses, there are actually distinctions between public IP addresses and private IP addresses.

Who manages public IP addresses?
Private IP addresses are usually managed by an organization's internal IT personnel, while public IP addresses are managed by ICANN (the Internet Corporation for Assigned Names and Numbers).

IP Address and Routing Control#

The network address portion of an IP address is used for routing control.
The routing control table records the network address and the address to which the next hop should be sent. Both hosts and routers have their own routing control tables.
When sending an IP packet, the first step is to read the destination address from the IP packet header, then find the record in the routing control table whose network address matches that address. Based on that record, the IP packet is forwarded to the corresponding next router. If several records in the routing control table match the destination, the one with the longest matching prefix is chosen.

Host A wants to send an IP packet with a source address of 10.1.1.30 and a destination address of 10.1.2.10. Since there is no matching network address for the destination address 10.1.2.10 in Host A's routing table, the packet is forwarded to the default route (Router 1).
Router 1 receives the IP packet and matches it against its routing table for a record with the same network address as the destination address. It finds a match and forwards the IP data packet to Router 2 at 10.1.0.2.
Router 2 receives the packet and similarly compares it with its routing table, finds a match, and sends the IP packet out through its interface at 10.1.2.1, ultimately forwarding the IP data packet to the target host via a switch.
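
The lookup used at each hop, longest prefix match with a default route as the fallback, can be sketched as follows. The table contents are hypothetical: only 10.1.1.30, 10.1.2.10, and 10.1.0.2 appear in the example above, while Router 1's address 10.1.1.1 and the exact prefixes are assumptions made for illustration.

```python
import ipaddress

# Hypothetical routing tables: (network, next hop). "direct" means the
# destination is on a directly attached link.
HOST_A = [
    ("10.1.1.0/24", "direct"),
    ("0.0.0.0/0", "10.1.1.1"),   # default route via Router 1 (assumed address)
]
ROUTER_1 = [
    ("10.1.1.0/24", "direct"),
    ("10.1.0.0/16", "direct"),
    ("10.1.2.0/24", "10.1.0.2"),  # via Router 2
]


def lookup(table, destination: str):
    """Return the next hop of the matching entry with the longest prefix."""
    dest = ipaddress.ip_address(destination)
    best = None
    for cidr, next_hop in table:
        network = ipaddress.ip_network(cidr)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


print(lookup(HOST_A, "10.1.2.10"))    # 10.1.1.1 (only the default route matches)
print(lookup(ROUTER_1, "10.1.2.10"))  # 10.1.0.2 (/24 beats /16)
```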

The loopback address does not flow to the network.

The loopback address is a default address used for network communication between programs on the same computer.
Computers use a special IP address 127.0.0.1 as the loopback address. A hostname called localhost has the same meaning as this address. When using this IP or hostname, the data packet does not flow to the network.

IP Fragmentation and Reassembly#

When the size of an IP packet exceeds the MTU, the IP packet will be fragmented.
After fragmentation, the reassembly of the IP datagram can only be performed by the destination host; routers do not perform reassembly.
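
As a rough sketch of how a payload is split, the following function computes the fragments for a given MTU. It assumes an IPv4 header of 20 bytes without options and relies on the IPv4 rule that fragment offsets are counted in units of 8 bytes; neither detail is stated above, and the function name is illustrative.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return (offset_in_8_byte_units, payload_bytes, more_fragments) tuples."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_payload, payload_len - offset)
        more_fragments = offset + size < payload_len
        fragments.append((offset // 8, size, more_fragments))
        offset += size
    return fragments


# A 4000-byte payload over a link with a 1500-byte MTU.
for frag in fragment(4000, 1500):
    print(frag)
```

With these numbers the payload is carried in three fragments of 1480, 1480, and 1040 bytes; only the last one has the "more fragments" flag cleared.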

Basic Understanding of IPv6#

IPv6 addresses are 128 bits long, allowing for an astonishing number of assignable addresses. A joke goes that IPv6 can ensure that every grain of sand on Earth can be assigned an IP address.
However, in addition to having more addresses, IPv6 also offers better security and scalability, meaning that IPv6 can provide a better network experience compared to IPv4.
But because IPv4 and IPv6 are not compatible with each other, not only do our computers and devices need to support it, but network operators also need to upgrade existing equipment, which may be one reason for the slow adoption rate of IPv6.

Highlights of IPv6

IPv6 not only increases the number of assignable addresses but also has many highlights.

IPv6 can be automatically configured, allowing for automatic IP address assignment even without a DHCP server, making it truly plug-and-play.
The IPv6 header has a fixed length of 40 bytes, eliminating the header checksum and simplifying the header structure, reducing the load on routers and significantly improving transmission performance.
IPv6 has network security features to combat IP address spoofing and prevent line eavesdropping, greatly enhancing security.

Identification method for IPv6 addresses

IPv4 addresses are 32 bits long, represented in groups of 8 bits using dotted-decimal notation.
IPv6 addresses are 128 bits long, represented in groups of 16 bits, separated by colons ":".
If there are consecutive groups of zeros, they can be omitted and replaced with two colons "::". However, "::" may appear only once in an IPv6 address.
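
Python's standard `ipaddress` module applies these notation rules, which makes them easy to try out. The sample address below uses the 2001:db8::/32 documentation prefix and is only an example; the helper names are made up.

```python
import ipaddress


def compress(address: str) -> str:
    """Shortest form: leading zeros dropped, one zero run replaced by '::'."""
    return ipaddress.IPv6Address(address).compressed


def is_valid(address: str) -> bool:
    """True if the text is a well-formed IPv6 address."""
    try:
        ipaddress.IPv6Address(address)
        return True
    except ValueError:
        return False


full = "2001:0db8:0000:0000:0000:0000:0000:0001"
print(compress(full))                             # 2001:db8::1
print(ipaddress.IPv6Address("2001:db8::1").exploded)
print(is_valid("2001::db8::1"))                   # False: '::' used twice
```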

Structure of IPv6 addresses

IPv6 addresses mainly include the following types:

Unicast address, used for one-to-one communication
Multicast address, used for one-to-many communication
Anycast address, used to communicate with the nearest node, where the nearest node is determined by the routing protocol
There is no broadcast address

Types of IPv6 Unicast Addresses

For one-to-one communication IPv6 addresses, there are three categories of unicast addresses, each with different valid ranges.

For unicast communication on the same link, without going through a router, link-local unicast addresses can be used; IPv4 has only a limited counterpart in the 169.254.0.0/16 range.
For unicast communication within an internal network, unique local addresses can be used, equivalent to IPv4 private IPs.
For communication over the Internet, global unicast addresses can be used, equivalent to IPv4 public IPs.
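
The three scopes can be told apart by prefix. The sketch below assumes the commonly used prefixes fe80::/10 (link-local), fc00::/7 (unique local), and 2000::/3 (global unicast); these prefixes are not given in the text above, and the function name is illustrative.

```python
import ipaddress

# Assumed prefixes for the three unicast categories.
LINK_LOCAL = ipaddress.ip_network("fe80::/10")
UNIQUE_LOCAL = ipaddress.ip_network("fc00::/7")
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")


def unicast_scope(address: str) -> str:
    """Name the unicast category an IPv6 address falls into."""
    ip = ipaddress.IPv6Address(address)
    if ip in LINK_LOCAL:
        return "link-local"
    if ip in UNIQUE_LOCAL:
        return "unique local"
    if ip in GLOBAL_UNICAST:
        return "global unicast"
    return "other"


for sample in ("fe80::1", "fd12:3456::1", "2001:4860:4860::8888"):
    print(sample, "->", unicast_scope(sample))
```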

IPv4 Header vs. IPv6 Header#

Improvements in the IPv6 header compared to IPv4:

The header checksum field has been removed. Since checks are performed at the data link layer and transport layer, IPv6 eliminates the need for IP checks.
Fragmentation/reassembly-related fields have been removed. Fragmentation and reassembly are time-consuming processes; IPv6 does not allow fragmentation and reassembly at intermediate routers, which can only occur at the source and destination hosts, greatly increasing the speed of router forwarding.
The options field has been removed. The options field is no longer part of the standard IP header, but it has not disappeared; it may appear at the location indicated by the "next header" in the IPv6 header. Removing the options field makes the IPv6 header a fixed length of 40 bytes.

DNS Domain Name Resolution#

DNS (the Domain Name System) automatically converts domain names into the IP addresses they refer to.

Hierarchical relationship of domain names

Domain names in DNS are separated by periods, such as www.server.com, where the periods represent boundaries between different levels.
In a domain name, the further to the right, the higher the level.
The root domain is at the top level, with the next level being the top-level domain com, followed by server.com.
Thus, the hierarchical relationship of domain names resembles a tree structure:

Root DNS server
Top-level domain DNS server (com)
Authoritative DNS server (server.com)
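
Reading a domain name from right to left gives the chain of levels in this tree. A small sketch, with a hypothetical helper name:

```python
def zone_chain(domain: str):
    """List the levels of a domain name from the root downward."""
    labels = domain.rstrip(".").split(".")
    chain = ["."]  # the root domain sits at the top
    # Walk the labels from right to left, adding one level each time.
    for start in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[start:]))
    return chain


print(zone_chain("www.server.com"))
# ['.', 'com', 'server.com', 'www.server.com']
```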

Workflow of domain name resolution

ARP and RARP Protocols#

When transmitting an IP datagram, once the source and destination IP addresses are known, the host consults its "routing table" to determine the next hop for the packet. However, the layer below the network layer is the data link layer, so we also need to know the MAC address of the "next hop."
Since the host's routing table yields the next hop's IP address, the ARP protocol can then be used to obtain the MAC address that corresponds to it.

How does ARP know the other party's MAC address?

ARP determines the MAC address using two types of packets: ARP request and ARP response.

The host sends an ARP request via broadcast, which contains the IP address of the host whose MAC address is being sought.
When all devices on the same link receive the ARP request, they will unpack the contents of the ARP request packet. If the target IP address in the ARP request matches their own IP address, that device will insert its MAC address into the ARP response packet and return it to the host.

Operating systems typically cache the first MAC address obtained via ARP for future reference, allowing the corresponding MAC address for an IP address to be found directly from the cache.
However, the cache for MAC addresses has a certain expiration period; once this period is exceeded, the cached content will be cleared.
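
The caching behavior can be modeled with a small in-memory table. This is only a sketch of the idea, not how any particular operating system implements it; the class name, the 60-second lifetime, and the sample addresses are assumptions.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ArpCache:
    """IP-to-MAC cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries: Dict[str, Tuple[str, float]] = {}

    def learn(self, ip: str, mac: str) -> None:
        """Store the MAC address taken from an ARP response."""
        self.entries[ip] = (mac, self.clock() + self.ttl_seconds)

    def lookup(self, ip: str) -> Optional[str]:
        """Return the cached MAC, or None if it is missing or expired."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[ip]  # expired: a new ARP request is needed
            return None
        return mac


cache = ArpCache(ttl_seconds=60)
cache.learn("10.1.1.1", "aa:bb:cc:dd:ee:01")
print(cache.lookup("10.1.1.1"))  # served from the cache, no broadcast needed
```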

Do you know what the RARP protocol is?

The RARP protocol is used to find an IP address given a known MAC address. This is often used when connecting devices such as print servers or other small embedded devices to the network.
Typically, a RARP server needs to be set up to register the MAC addresses and their corresponding IP addresses. Then, when the device connects to the network:

The device sends a request message saying, "My MAC address is XXXX, please tell me what my IP address should be."
The RARP server receives this message and returns the information, "The device with MAC address XXXX has the IP address XXXX."

Finally, the device sets its IP address based on the response received from the RARP server.

DHCP Dynamic IP Address Acquisition#

Our computers typically obtain IP addresses dynamically through DHCP, greatly simplifying the tedious process of configuring IP information.

First, it's important to note that the DHCP client process listens on port 68, while the DHCP server process listens on port 67.
These are the four steps:

The client first initiates a DHCP discovery message (DHCP DISCOVER) as an IP datagram. Since the client does not have an IP address and does not know the address of the DHCP server, it uses UDP broadcast communication, with a broadcast destination address of 255.255.255.255 (port 67) and 0.0.0.0 (port 68) as the source IP address. The DHCP client passes this IP datagram to the link layer, which then broadcasts the frame to all devices on the network.
When the DHCP server receives the DHCP discovery message, it responds to the client with a DHCP offer message (DHCP OFFER). This message still uses the IP broadcast address 255.255.255.255 and contains information about the IP address, subnet mask, default gateway, DNS server, and IP address lease duration provided by the server.
After the client receives one or more DHCP offer messages from servers, it selects one server and sends a DHCP request message (DHCP REQUEST) to respond, echoing the configured parameters.
Finally, the server responds to the DHCP request message with a DHCP ACK message, confirming the requested parameters.

Once the client receives the DHCP ACK, the interaction is complete, and the client can use the IP address assigned by the DHCP server for the lease duration.
If the lease on the DHCP IP address is nearing expiration, the client will send a DHCP request message to the server:
If the server agrees to continue the lease, it will respond with a DHCP ACK message, and the client will extend the lease.
If the server does not agree to continue the lease, it will respond with a DHCP NAK message, and the client must stop using the leased IP address.
It can be observed that the entire DHCP interaction uses UDP broadcast communication.
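
The message exchange can be modeled in memory to make the four steps and the renewal case concrete. This is a simplified sketch that does not send real UDP packets; the class, the dictionary-based message format, and the address pool are all hypothetical.

```python
class DhcpServer:
    """Toy DHCP server: handles DISCOVER and REQUEST messages as dicts."""

    def __init__(self, pool, lease_seconds: int = 3600):
        self.free = list(pool)   # addresses not yet leased
        self.leases = {}         # client MAC -> leased IP
        self.lease_seconds = lease_seconds

    def handle(self, message: dict):
        kind, mac = message["type"], message["mac"]
        if kind == "DISCOVER":
            # Offer the client's existing lease, or the next free address.
            ip = self.leases.get(mac) or (self.free[0] if self.free else None)
            if ip is None:
                return None  # pool exhausted: no offer
            return {"type": "OFFER", "mac": mac, "ip": ip,
                    "lease": self.lease_seconds}
        if kind == "REQUEST":
            ip = message["ip"]
            if self.leases.get(mac) == ip:          # renewal of own lease
                return {"type": "ACK", "mac": mac, "ip": ip}
            if ip in self.free:                     # first-time request
                self.free.remove(ip)
                self.leases[mac] = ip
                return {"type": "ACK", "mac": mac, "ip": ip}
            return {"type": "NAK", "mac": mac}      # address not available
        raise ValueError(f"unsupported message type: {kind}")


server = DhcpServer(["192.168.1.100", "192.168.1.101"])
offer = server.handle({"type": "DISCOVER", "mac": "aa:aa"})
ack = server.handle({"type": "REQUEST", "mac": "aa:aa", "ip": offer["ip"]})
print(offer["type"], ack["type"], ack["ip"])  # OFFER ACK 192.168.1.100
```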

If it uses broadcast, what if the DHCP server and client are not on the same local area network, and routers do not forward broadcast packets? Does that mean each network needs to configure a DHCP server?

To solve this issue, a DHCP relay agent was introduced. With a DHCP relay agent, IP address allocation across different subnets can be managed by a single DHCP server.

The DHCP client sends a DHCP request packet to the DHCP relay agent, which, upon receiving this broadcast packet, forwards it to the DHCP server in a unicast manner.
The server then responds to the DHCP relay agent, which broadcasts this packet back to the DHCP client.

Thus, even if the DHCP server is not on the same link, it can still manage and allocate IP addresses uniformly.

NAT Network Address Translation#

IPv4 addresses are in short supply, leading to the proposal of a method called Network Address Translation (NAT) to alleviate the exhaustion of IPv4 addresses.
This method translates both the IP address and port number.
In this way, only one global IP address is needed, and this translation technology is called Network Address and Port Translation (NAPT).

In the diagram, there are two clients, 192.168.1.10 and 192.168.1.11, communicating simultaneously with the server 183.232.231.172, both using local port 1025.
At this point, both private IP addresses are translated to the public address 120.229.175.121, but differentiated by different port numbers.
Thus, a NAPT router's translation table is generated, allowing the correct translation of address and port combinations, enabling clients A and B to communicate simultaneously with the server.
This translation table is automatically generated on the NAT router. For example, in the case of TCP, the entry is created when the SYN packet of the initial handshake passes through the router, and it is deleted once the acknowledgment of the FIN packet that closes the connection has been seen.
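
A translation table of this kind can be sketched as two dictionaries. The addresses come from the example above; the class name and the public port numbers chosen by the router (starting at 1025 here) are assumptions, and entry removal on connection close is left out for brevity.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class Napt:
    """Toy NAPT table mapping private endpoints to public ports."""

    def __init__(self, public_ip: str, first_port: int = 1025):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound_map: Dict[Endpoint, int] = {}
        self.inbound_map: Dict[int, Endpoint] = {}

    def outbound(self, private_ip: str, private_port: int) -> Endpoint:
        """Translate a private source endpoint, creating an entry if needed."""
        key = (private_ip, private_port)
        if key not in self.outbound_map:
            self.outbound_map[key] = self.next_port
            self.inbound_map[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.outbound_map[key])

    def inbound(self, public_port: int) -> Optional[Endpoint]:
        """Translate a reply back; None means no entry, so it is dropped."""
        return self.inbound_map.get(public_port)


nat = Napt("120.229.175.121")
print(nat.outbound("192.168.1.10", 1025))  # ('120.229.175.121', 1025)
print(nat.outbound("192.168.1.11", 1025))  # ('120.229.175.121', 1026)
print(nat.inbound(1026))                   # ('192.168.1.11', 1025)
```

A packet arriving on a public port with no entry returns `None`, which is the reason an outside host cannot initiate a connection to a machine behind the NAT.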

What are the disadvantages of NAT?

Since NAT/NAPT relies on its own translation table, the following issues arise:

External entities cannot actively connect to NAT internal servers because there are no translation records in the NAPT translation table.
The generation of the translation table and the translation operations incur performance overhead.
During communication, if the NAT router restarts, all TCP connections will be reset.

How to solve the potential problems of NAT?

Switch to IPv6.
NAT traversal technology.
This allows network applications to actively discover that they are behind a NAT device, obtain the public IP of the NAT device, and establish port mapping entries for themselves, all of which are done automatically by applications behind the NAT device.

ICMP Internet Control Message Protocol#

ICMP stands for Internet Control Message Protocol.
In complex network transmission environments, network packets often encounter various issues. Therefore, messages need to be sent out to report what problems have been encountered, allowing adjustments to transmission strategies to control the overall situation.

What functions does ICMP have?

ICMP's main functions include confirming whether IP packets have successfully reached the target address, reporting reasons for IP packets being discarded during transmission, and improving network settings.
If an IP packet fails to reach the target address for some reason during IP communication, ICMP is responsible for notifying the specific reason.

In the example above, Host A sends a data packet to Host B, but for some reason, Router 2 in the path fails to detect the existence of Host B. In this case, Router 2 sends an ICMP destination unreachable packet to Host A, indicating that the packet sent to Host B was unsuccessful.
ICMP's notification messages are sent using IP.
Thus, the ICMP packet returned from Router 2 will follow the usual routing control, first passing through Router 1 before being forwarded to Host A. Upon receiving this ICMP packet, Host A will unpack the ICMP header and data field to learn the specific reason for the problem.

ICMP Types

ICMP can be roughly divided into two main categories:

One category is diagnostic query messages, known as "query message types."
The other category is error messages that notify of issues, known as "error message types."

Query Message Types#

Echo Message - Type 0 and 8

Echo messages are used between communicating hosts or routers to determine whether the sent data packets have successfully reached the other end; the ping command utilizes this message.

Ping#

Let's look at the sending and receiving process of ping.

When the ping command is executed, the source host first constructs an ICMP echo request message data packet.
Then, the ICMP protocol hands this packet along with the address 192.168.1.2 to the IP layer. The IP layer will use 192.168.1.2 as the destination address, the local IP address as the source address, set the protocol field to 1 to indicate it is ICMP, and add some other control information to construct an IP packet.
Next, a MAC header needs to be added. If the MAC address for the next hop is not already cached, it is obtained through ARP.
When Host B receives this data frame, it first checks its destination MAC address and compares it with its own MAC address. If they match, it receives the packet; otherwise, it discards it.
After receiving, it checks the data frame, extracts the IP packet from the frame, and hands it over to its own IP layer. Similarly, the IP layer checks and extracts the useful information to pass it to the ICMP protocol.
Host B constructs an ICMP echo response message data packet, with the type field of the response packet set to 0, the sequence number matching that of the received request packet, and then sends it back to Host A.
If the source host does not receive the ICMP response packet within the specified time, it indicates that the target host is unreachable; if it receives the ICMP echo response message, it indicates that the target host is reachable.
At this point, the source host subtracts the time at which the request was sent from the current time to obtain the round-trip delay of the ICMP packet.
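
The first step, constructing the echo request, can be sketched with the standard `struct` module. The layout used here (type 8, code 0, checksum, identifier, sequence number) and the 16-bit one's-complement checksum follow the usual ICMP echo format, which the text above does not spell out, so treat them as assumptions. The sketch only builds the bytes; actually sending them requires a raw socket and is omitted.

```python
import struct


def checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over the given bytes."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int,
                       payload: bytes = b"") -> bytes:
    """Build an ICMP echo request (type 8, code 0)."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    value = checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, value, identifier, sequence) + payload


packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
print(checksum(packet))  # a packet with a correct checksum verifies to 0
```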

Error Message Types#

Destination Unreachable Message - Type 3
When an IP router cannot send an IP packet to the target address, it returns a destination unreachable ICMP message to the sending host, indicating the specific reason for the unreachability, which is recorded in the code field of the ICMP packet header.
Source Quench Message - Type 4
In the case of using low-speed wide area lines, routers connecting to WAN may encounter network congestion issues.
The purpose of the ICMP source quench message is to alleviate this congestion.
When a router forwards data onto a low-speed line and its send queue has no buffer space left, so that packets cannot be sent out, it can send an ICMP source quench message to the source address of the IP packet.
The host receiving this message understands that there is congestion at some point along the entire line, thereby increasing the transmission interval of IP packets to reduce network congestion.
However, since this ICMP message may cause unfair network communication, it is generally not used.
Redirect Message - Type 5
If a router discovers that the sending host is using a "non-optimal" path to send data, it will return an ICMP redirect message to that host.
This message contains the most suitable routing information and source data. This mainly occurs when the router has better routing information. The router will inform the sender through this ICMP message to send to another router next time.
Time Exceeded (Timeout) Message - Type 11
An IP packet has a field called TTL (Time To Live), which decreases by 1 each time it passes through a router. When it reaches 0, the IP packet is discarded.
At this point, the router will send an ICMP timeout message to the sending host, notifying that the packet has been discarded.
The main purpose of setting the IP packet's lifetime is to avoid endless forwarding of IP packets in the network when routing control encounters problems.

Traceroute#

Function 1: Intentionally set a special TTL to trace the routers passed on the way to the destination.

How does this function work?

Its principle is to use the lifetime of the IP packet, starting from 1 and incrementing sequentially while sending UDP packets, forcing the reception of ICMP timeout messages.
For example, setting the TTL to 1 will sacrifice the packet at the first router, which will then return an ICMP error message indicating a timeout.
Next, setting the TTL to 2 will allow the packet to pass the first router but sacrifice it at the second router, which will also return an ICMP error message. This process continues until reaching the destination host.
Through this process, traceroute can obtain the IP addresses of all the routers.
Of course, some routers may not return this ICMP message, so for some public addresses, the intermediate routers may not be visible.

How does the sender know if the sent UDP packet has reached the destination host?

When sending UDP packets, traceroute fills in a destination port that is unlikely to be in use, starting at 33434. For each subsequent probe, it increments this number, since these ports are generally considered unused. However, the behavior is not guaranteed if an application happens to be listening on one of these ports.
When the destination host receives the UDP packet, it will return an ICMP error message, but the type of this error message will be "port unreachable."
Thus, when the error message type is "port unreachable," it indicates that the UDP packet sent by the sender has reached the destination host.
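
The TTL mechanism can be simulated without touching the network. In this sketch the path is just a list of addresses whose last element is the destination; the addresses reuse the earlier routing example, with 10.1.1.1 assumed for the first router, and the function names are illustrative.

```python
def probe(path, ttl: int):
    """Simulate one UDP probe; return (responder, ICMP message)."""
    if not path:
        raise ValueError("path must contain at least the destination")
    for hop_number, hop in enumerate(path, start=1):
        if hop_number == len(path):
            # The probe reached the destination, where the port is unused.
            return hop, "port unreachable"
        ttl -= 1  # each router decrements the TTL
        if ttl == 0:
            return hop, "time exceeded"


def traceroute(path, max_ttl: int = 30):
    """Collect the responders for TTL = 1, 2, 3, ... until the end."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, message = probe(path, ttl)
        hops.append(responder)
        if message == "port unreachable":
            break  # the destination itself answered
    return hops


path = ["10.1.1.1", "10.1.0.2", "10.1.2.10"]  # two routers, then the host
print(traceroute(path))
```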
Function 2: Intentionally set "do not fragment" to determine the path's MTU.

Why do this?

The purpose is to discover the path MTU.
Sometimes we do not know the MTU size of the routers; the MTU on Ethernet data links is usually 1500 bytes, but the MTU value on non-Ethernet links may vary. Therefore, we need to know the MTU size to control the size of the packets sent.

Its working principle is as follows:
First, when the sending host sends the IP datagram, it sets the "do not fragment" (DF) flag in the IP packet header to 1. Based on this flag, routers along the path will not fragment an oversized packet but will discard it instead.
The router that discards the packet then sends an ICMP destination unreachable message back to the sending host, with the code "fragmentation needed but DF set," indicating the MTU of the next data link.
Each time the sending host receives an ICMP error message, it reduces the packet size to locate an appropriate MTU value to ensure it can reach the target host.
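
This loop can be simulated as well. The sketch below models the path as a list of per-link MTU values, which are made-up numbers, and assumes each ICMP error reports the MTU of the link that rejected the packet.

```python
def send_with_df(link_mtus, size: int):
    """Send a packet with DF set; return None if it is delivered,
    otherwise the MTU reported in the ICMP error."""
    for mtu in link_mtus:
        if size > mtu:
            return mtu  # the router in front of this link drops the packet
    return None


def discover_path_mtu(link_mtus, initial_size: int = 1500) -> int:
    """Shrink the packet size until it fits through every link."""
    size = initial_size
    while True:
        reported = send_with_df(link_mtus, size)
        if reported is None:
            return size
        size = reported  # retry with the MTU from the ICMP error


print(discover_path_mtu([1500, 1400, 1280, 1500]))  # 1280
```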

IGMP Internet Group Management Protocol#

Previously, we learned about multicast addresses, which are Class D addresses. Since multicast means that only a group of hosts can receive the data packets, while hosts not in the group cannot receive the packets, how do we manage who is in the group? This is where the IGMP protocol comes in.

IGMP is the Internet Group Management Protocol, which operates between hosts (multicast members) and the last-hop router, as shown in the blue part of the diagram above.

Hosts use IGMP messages to ask routers to let them join or leave multicast groups. By default, routers do not forward multicast packets to connected hosts unless the hosts join the multicast group via IGMP. When a host requests to join a multicast group, the router records this in the IGMP router table, and subsequently forwards multicast packets to the corresponding hosts.
IGMP messages are encapsulated in IP, with the protocol number in the IP header set to 2, and the TTL field value is usually set to 1, as IGMP operates between the host and the connected router.

IGMP Working Mechanism

IGMP is divided into three versions: IGMPv1, IGMPv2, and IGMPv3.
Next, taking IGMPv2 as an example, we will discuss the mechanisms of general queries and responses, as well as leaving the multicast group.

General Query and Response Mechanism
Leaving the Multicast Group Mechanism
Situation 1: There are still hosts in the multicast group on the segment:

Situation 2: There are no hosts in the multicast group on the segment: