IPv6

Tags: networking, tutorials.
By lucb1e on 2012-11-19 23:54:45 +0100

Since I couldn't find a clear all-you-need-to-know guide about IPv6, I'll attempt to write one.

For everyone who doesn't want to know everything about it, I wrote some shorter pieces.
What you need to know as...
- User
- Webdeveloper
- Gameserver admin

For everyone who is more interested:
I presume you have some basic understanding of how our current internet works. I won't bore you with the details of how a packet is built from bit to bit, nor will there be any step-by-step guides. I'll try to focus on how the new stuff in IPv6 works, which gives you enough info to seek out more on anything you want to experiment with.

Index
- What's so great about IPv6?
- Bits
- Address notation
- Types of addresses
- Special addresses and prefixes
- Security now that addresses are global
- Privacy considerations
- Credits / sources


What's so great about IPv6?
I can't seem to avoid telling you that there is a bigger address space. So now that's out there, let's move on to what you don't already know: It's more efficient and smarter than IPv4 in mainly three ways.

1) An IPv4 packet may get fragmented along the way which is quite tedious work for the router, while IPv6 detects (or tries to find out) the MTU on the path and prevents this. Chopping and rebuilding the packet is done by the source and destination, not the routers in between.

2) There is no checksum in the IP header anymore. Usually both a lower and a higher layer already handle this: Ethernet has an FCS, and TCP and UDP each carry a checksum (optional for UDP over IPv4, mandatory over IPv6). So why bother giving IP yet another one? It only slows things down, because every router has to recompute it after decrementing the TTL. The TCP and UDP checksums also cover a pseudo-header containing the source and destination addresses, not just their own header and payload.

3) IPv6 works with extensions which can be chained.
The header has a field called "Next Header", which tells what's coming next. For example, if there are no extensions and UDP is next, the value is set to 17 to indicate that the next header is the UDP header and that the IP header ends there.

If it needs an Authentication Header extension, it will set the value to 51. The Authentication Header in turn contains its own Next Header field, which it can set to another value (another extension or UDP, TCP, ICMP, you name it). If it's another extension, that too contains a Next Header field. This way, if no extensions are needed, the packet can be kept very short. And if you need 10 extensions, that's no problem either.
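To make the chaining concrete, here is a minimal sketch in Python that follows the Next Header fields through a raw packet. The function names and the demo packet are made up for illustration, and it only knows the handful of extension headers listed in the code.

```python
import struct

# Protocol numbers as registered with IANA.
HOP_BY_HOP, UDP, ROUTING, FRAGMENT, AUTH, DEST_OPTS = 0, 17, 43, 44, 51, 60
EXTENSIONS = {HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS}


def walk_next_headers(packet):
    """Return every Next Header value in order, ending at the upper-layer protocol."""
    next_header = packet[6]  # Next Header is byte 6 of the fixed 40-byte header
    offset = 40              # first byte after the fixed header
    chain = [next_header]
    while next_header in EXTENSIONS:
        following = packet[offset]         # each extension starts with its own Next Header
        length_field = packet[offset + 1]
        if next_header == FRAGMENT:
            size = 8                       # the Fragment header has a fixed size
        elif next_header == AUTH:
            size = (length_field + 2) * 4  # AH counts 4-byte words, minus 2
        else:
            size = (length_field + 1) * 8  # others count 8-byte units, minus 1
        offset += size
        next_header = following
        chain.append(next_header)
    return chain


def build_demo_packet():
    """Hypothetical packet: fixed header -> Authentication Header -> UDP."""
    # Version 6, payload length 32, Next Header 51, hop limit 64, zeroed addresses.
    fixed = struct.pack("!IHBB", 0x60000000, 32, AUTH, 64) + bytes(32)
    auth = bytes([UDP, 4, 0, 0]) + bytes(20)      # 24-byte AH whose Next Header is UDP
    udp = struct.pack("!HHHH", 12345, 53, 8, 0)   # empty UDP datagram
    return fixed + auth + udp


chain = walk_next_headers(build_demo_packet())
```

For the demo packet, `chain` holds 51 followed by 17: first the Authentication Header, then UDP.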

IPv4 had a variable header length between 20 and 60 bytes. The IPv6 header is fixed at 40 bytes, which makes hardware optimization easier.


Bits
128 bits! So many gazillion addresses! We're never going to run out of space!
Actually, half the number of bits is already reserved for your LAN.
"Wait, you're telling me they reserved 18446744073709551616 addresses for my grandmother's home network?" (That's 2^32 times more than the entire IPv4 space.)
Well, yes. Hey don't blame me, e-mail these guys.

It has some use though: With the rather limited IPv4 space, networks had to be designed in a way that they were both efficient (not using unnecessary space) and expandable in the future when needed. This huge space for the so-called interface ID allows you to simply work with big subnets without taking too much space.


Anyway, the address is split in a few parts:
- The first 32 bits form the minimum allocation size; you (as an ISP) can't get assigned fewer addresses than this. This minimum keeps routing tables smaller, which might otherwise become huge.

- The ISP is then free to hand out whichever subnet it wishes. It used to be recommended to use /48 subnets, but this was superseded by another recommendation (RFC6177) which leaves ISPs more free.

- Usually, the last 64 bits are for the LAN segment and are called the interface ID. The part left to the customer may be larger than that, for example 80 bits with a /48 assignment. This is up to the ISP.
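The split above can be sketched with Python's standard ipaddress module. The /32 and /48 boundaries are only the assumptions mentioned in the text (an ISP may choose differently), the function name is made up, and the address comes from the documentation range.

```python
import ipaddress


def split_address(text, isp_bits=32, site_bits=48):
    """Split an address at the (assumed) ISP, customer and LAN boundaries."""
    address = ipaddress.IPv6Address(text)
    # strict=False lets the host bits be set; they are simply masked off.
    isp_prefix = ipaddress.IPv6Network(f"{text}/{isp_bits}", strict=False)
    site_prefix = ipaddress.IPv6Network(f"{text}/{site_bits}", strict=False)
    lan_prefix = ipaddress.IPv6Network(f"{text}/64", strict=False)
    interface_id = int(address) & ((1 << 64) - 1)  # the last 64 bits
    return isp_prefix, site_prefix, lan_prefix, interface_id


isp, site, lan, interface_id = split_address("2001:db8:1f44:c66d:a3:fa9:dea:9fe")
```

Here `isp` is 2001:db8::/32, `site` is 2001:db8:1f44::/48 and `lan` is 2001:db8:1f44:c66d::/64. `lan.num_addresses` is 2**64, the number quoted for grandmother's home network.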


Address notation
8 hexadecimal blocks of 2 bytes each, separated by colons:
72AF:B342:0020:0000:0000:0000:B342:0020

Like with our own number system, leading zeros aren't written. You say I got 1 million, not 0001 million. The above address could be abbreviated to:
72AF:B342:20:0:0:0:B342:20

Lastly you can abbreviate this further, I think this is best explained by example:
72AF:B342:20::B342:20

The double colon inserts as many zeros as needed to bring the address to the desired 128 bits. You can obviously use it only once, or else there would be no way to know how many zeros to insert at one :: and how many at the other.

CIDR notation remains the same, you can still write 72AF:B342::/32.
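If you want to check these abbreviation rules yourself, Python's standard ipaddress module applies them for you. This is just a small sketch; note that Python prints addresses in lowercase, and `is_valid` is a made-up helper.

```python
import ipaddress

address = ipaddress.IPv6Address("72AF:B342:0020:0000:0000:0000:B342:0020")

full = address.exploded     # every block written out with leading zeros
short = address.compressed  # leading zeros dropped, longest zero run becomes ::

network = ipaddress.IPv6Network("72AF:B342::/32")  # CIDR notation works as before


def is_valid(text):
    """Return True if text parses as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ipaddress.AddressValueError:
        return False
```

`short` comes out as 72af:b342:20::b342:20, and `is_valid("72AF::B342::20")` is False because of the second double colon.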

Some applications (like many browsers) require square brackets around the address to separate the address from a port number: https://[7F93::1337]:443/
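The bracket rule is easy to get wrong when building or parsing URLs by hand. A minimal sketch with Python's standard urllib, where `format_endpoint` is a made-up helper:

```python
from urllib.parse import urlsplit

parts = urlsplit("https://[7F93::1337]:443/")
host = parts.hostname  # brackets removed, lowercased
port = parts.port      # the number after the closing bracket


def format_endpoint(host, port):
    """Join host and port, adding brackets only when the host contains colons."""
    if ":" in host:
        return f"[{host}]:{port}"
    return f"{host}:{port}"
```

Without the brackets there would be no way to tell whether the final :443 is a port or the last block of the address.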

To map an IPv4 address in an IPv6 packet, you can write the address as:
::FFFF:192.168.123.123
Or in full:
0000:0000:0000:0000:0000:FFFF:192.168.123.123
With 45 characters this is the longest possible address—visually at least, it is still 128 bits of course.

Both of these are identical to:
::FFFF:C0A8:7B7B

Note that this doesn't work "out of the box" and can't be used without really configuring it. It's simply a recognized way of writing an IPv4 address in IPv6 notation.
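The equivalence of the two notations can be checked with a few lines of Python. This is only a sketch of the notation, not of any configuration; `to_mapped` is a made-up helper.

```python
import ipaddress

dotted = ipaddress.IPv6Address("::FFFF:192.168.123.123")
hexadecimal = ipaddress.IPv6Address("::FFFF:C0A8:7B7B")

same = dotted == hexadecimal  # both spellings describe the same 128 bits
embedded = dotted.ipv4_mapped  # the IPv4 address carried in the last 32 bits


def to_mapped(ipv4_text):
    """Place an IPv4 address behind the ::FFFF:0:0/96 prefix."""
    ipv4 = ipaddress.IPv4Address(ipv4_text)
    return ipaddress.IPv6Address((0xFFFF << 32) | int(ipv4))


longest = "0000:0000:0000:0000:0000:FFFF:192.168.123.123"
```

`len(longest)` is 45, which is where the maximum display length comes from.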


Address types
There are no broadcast addresses anymore. Basically there are three kinds of addresses now:

- Unicast

- Multicast

- Anycast

Unicast is an address like we are used to it: One address assigned to one interface. The only difference with IPv4 is that most addresses will be global (or 'globally routable'), so the IPv6 address of your computer will be the address with which you reach this website instead of the address from your router. I'll talk more about this in a later chapter.

Multicast is going to play a slightly bigger role now that broadcast addresses are gone and all such traffic needs to go through multicast addresses. Every multicast address falls within FF00::/8.
An example of a standard multicast address is FF02::2, which concerns all routers. You can see more of those in modern /etc/hosts files on Linux systems (or click here).

Anycast is somewhat new to IPv6. An anycast address looks the same as a unicast address, but it's assigned to multiple interfaces. A packet sent to an anycast address will be delivered to only one of those interfaces (usually the closest one). All interfaces that own this address must be configured so that they know it is an anycast address.


Special addresses and prefixes
:: is the new 0.0.0.0, meaning "unspecified address." For example, when configuring Apache you may use "Listen 0.0.0.0:80" somewhere. This will now be "Listen [::]:80".

::1/128 is the new 127/8. In case you didn't get that: the entire 127.0.0.0/8 block is reserved for loopback addresses, which point to your own computer. In practice, 127.0.0.1 is the only one commonly used. In IPv6 the loopback address is only ::1.

2001:db8::/32 is the official documentation subnet; nobody should use it in production networks.

2001:0::/32 is for Teredo clients. Teredo is a technique that enables IPv6 connectivity through IPv4 networks.

2002::/16 is for IPv6 to IPv4 addresses (also known as 6to4). 127.0.0.1 would be 2002:7F00:0001::/48 (7F is hexadecimal for 127). Note that this, like ::FFFF:127.0.0.1, also requires special configuration.

FE80::/10 addresses are link-local, comparable to 169.254.0.0/16. They go as far as the link goes; for example, if you connected the interface via a cross-over cable to another computer, that is the link. If it is connected via a hub or switch, then everyone connected to that is part of the link. Routers do not forward these packets.

Site-local addresses are in the FEC0::/10 range, but the definition of what exactly a "site" is was ambiguous, and these addresses are already deprecated.
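Most of the ranges above are known to Python's standard ipaddress module, so a rough classifier takes only a few lines. The `describe` function and its labels are made up for this sketch, and it only covers the prefixes discussed in this chapter.

```python
import ipaddress

DOCUMENTATION = ipaddress.IPv6Network("2001:db8::/32")


def describe(text):
    """Name the special range an address belongs to, if any."""
    address = ipaddress.IPv6Address(text)
    if address.is_unspecified:
        return "unspecified"
    if address.is_loopback:
        return "loopback"
    if address.is_multicast:
        return "multicast"
    if address.is_link_local:
        return "link-local"
    if address.is_site_local:
        return "site-local (deprecated)"
    if address in DOCUMENTATION:
        return "documentation"
    if address.teredo is not None:     # only set inside 2001::/32
        return "Teredo"
    if address.sixtofour is not None:  # only set inside 2002::/16
        return f"6to4 for {address.sixtofour}"
    return "other unicast"
```

For example, `describe("2002:7f00:1::")` returns "6to4 for 127.0.0.1", matching the 6to4 example above.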


"You mentioned something about 'globally routable'...?"
In the current IPv4 state, as you should know, there is usually a network behind each public IP. Usually this is in the private IP range 10.0.0.0/8 or 192.168.0.0/16, and it contains all computers, laptops, smartphones, tablets, TVs, printers and whatnot. They all communicate beyond their network through a single public (globally routable) IP address.

This works via NAT, which is integrated into consumer routers and keeps a list of connections. If your laptop just sent out a TCP connection request to 173.194.66.102 (Google), the router will remember and nicely forward the connection acceptance (or refusal) back to the same laptop.

The way it does this is by port numbers. When sending the packet, a source port is set to which Google will reply, and which the router can look up in its table: "Let's see, port 62031... ah that's a connection with 192.168.1.104!".

Now imagine a random packet comes in:
"Let's see, port 80... huh I don't have that one in my list!"
In this case someone attempted to connect on port 80, the HTTP port. But none of the devices in the network sent a request with a source port set to 80! So the NAT router will drop the packet (simply do nothing with it). This, by the way, is where port forwarding comes into the picture: you tell the router to always forward packets arriving on a certain port (like port 80) to one device (like a home server or your computer).
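The bookkeeping described above can be sketched as a small lookup table. This is a heavily simplified toy model, not how any real router is implemented: the class and method names are made up, the public address is from a documentation range, and real NAT also tracks the remote address, protocol and timeouts.

```python
class NatRouter:
    """Toy NAT: maps public ports to (private address, private port)."""

    def __init__(self, public_ip, first_port=62031):
        self.public_ip = public_ip
        self.table = {}  # public port -> (private ip, private port)
        self.next_port = first_port

    def outbound(self, private_ip, private_port):
        """Remember the connection and return the rewritten source."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        return self.public_ip, public_port

    def forward_port(self, public_port, private_ip, private_port):
        """Port forwarding: a permanent entry added by the administrator."""
        self.table[public_port] = (private_ip, private_port)

    def inbound(self, public_port):
        """Return where to deliver the packet, or None to drop it."""
        return self.table.get(public_port)


router = NatRouter("203.0.113.7")
source = router.outbound("192.168.1.104", 51000)  # laptop opens a connection
reply_target = router.inbound(62031)              # the reply finds its way back
stranger = router.inbound(80)                     # not in the list: dropped
```

After `router.forward_port(80, "192.168.1.10", 80)`, the same inbound lookup for port 80 would return the home server instead of None.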

Now that IPv6 addresses are globally routable, this NAT conversion to switch between a local and global address is not needed anymore. The router will simply forward your packets, but not change the address inside.

The first thought popping into sysadmins' heads when they hear of this is "But then everyone can reach my internal network!"
No. The NAT environment forces a default-deny policy on the router; if a packet wasn't recognized, it had to be dropped. The fact that the destination interface (as identified by the IPv6 address) is now known doesn't mean your firewall (which is also integrated in your consumer router) will blindly pass on all packets. It will still keep a list like it did with NAT, and let the right packets through, even though it doesn't technically need to. The destination is known, but the integrated firewall protects you just like NAT used to.

So everyone in the world may have a route to your address, hence it's routable, but that doesn't mean it's reachable. You may even configure the firewall so that it doesn't allow any traffic whatsoever coming from the internet towards a certain IP, even if that IP initiated traffic first.
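Routable but not reachable can be modelled in the same spirit: the firewall keeps a list of connections but rewrites nothing. Again this is a toy sketch with made-up names and documentation addresses, not a description of any particular firewall.

```python
class StatefulFirewall:
    """Toy default-deny firewall for globally routable addresses."""

    def __init__(self):
        self.flows = set()       # (inside ip, inside port, outside ip, outside port)
        self.no_inbound = set()  # inside addresses that never accept inbound traffic

    def outbound(self, src, src_port, dst, dst_port):
        """Record a connection opened from the inside; addresses stay untouched."""
        self.flows.add((src, src_port, dst, dst_port))

    def inbound(self, src, src_port, dst, dst_port):
        """Allow a packet from the internet only if it answers a known flow."""
        if dst in self.no_inbound:
            return False
        return (dst, dst_port, src, src_port) in self.flows


firewall = StatefulFirewall()
firewall.outbound("2001:db8:1::10", 51000, "2001:db8:ffff::1", 443)
reply_allowed = firewall.inbound("2001:db8:ffff::1", 443, "2001:db8:1::10", 51000)
stranger_allowed = firewall.inbound("2001:db8:ffff::2", 40000, "2001:db8:1::10", 80)
```

Adding "2001:db8:1::10" to `no_inbound` would also block the reply, which is the strictest setting mentioned above.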


Privacy enhancements
One more thing about those addresses: by default, they'll usually change every few hours. With IPv4 a client usually tries to keep the same (local/private) address as long as possible, even between reboots. For privacy considerations (the new addresses are global), this is no longer the case.

This is partly a response to a concern about including the MAC address in the global IPv6 address. An address may be derived from the MAC address, but it doesn't have to be. If it were, you could be tracked across ISPs without cookies or anything else. That is why addresses are now designed to change regularly. Windows, Mac OS X, iOS and some distributions of Linux have this enabled by default.
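To see why a MAC-based address is trackable, here is a sketch of the usual derivation (modified EUI-64: flip one bit in the first byte and insert FF:FE in the middle). The MAC address and prefix are made up, and the random variant is only a stand-in for what operating systems really do when they generate temporary addresses.

```python
import ipaddress
import secrets


def eui64_interface_id(mac):
    """Derive a 64-bit interface ID from a 48-bit MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    eui64 = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return int.from_bytes(eui64, "big")


def random_interface_id():
    """Stand-in for a temporary address: 64 random bits."""
    return secrets.randbits(64)


def build_address(prefix, interface_id):
    """Combine a /64 prefix with an interface ID."""
    network = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)


stable = build_address("2001:db8:1f44:c66d::/64", eui64_interface_id("00:11:22:33:44:55"))
temporary = build_address("2001:db8:1f44:c66d::/64", random_interface_id())
```

The last 64 bits of `stable` stay the same on every network the device joins, whereas `temporary` only shares the prefix with it.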

Of course, like with IPv4, you can set an address to be static instead of changing every few hours. This works in the same way as IPv4; simply set the address you want in the interface's properties (as shown here).


What you need to know...
As user

Nothing at all!
If everything goes according to plan, you will not notice the change. You might need a new modem some time and in the worst case someone needs to come fix something on your computer, but that's the extent of it.


As webdeveloper
It is good to be familiar with the address notation, you will come across it some time.
Users are usually uniquely identified by the first 64 bits of their address (the first 4 blocks of hexadecimals). Anything behind that probably changes every few hours. If you want to ban somebody, it's always good to start by blocking the address:
2001:980:1f44:c66d:a3:fa9:dea:9fe

But if they persistently keep changing their IP, you might want to block this subnet instead:
2001:980:1f44:c66d::/64

As a last resort, you can keep widening the blocked subnet, until you've entirely disabled the ISP by blocking this:
2001:980::/32
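Widening a ban step by step could look like this in Python. It's a minimal sketch: the helper names are made up, and the address is modelled on the example used in this section.

```python
import ipaddress


def ban_scope(address_text, prefix_length):
    """Return the subnet of the given size that contains the address."""
    return ipaddress.IPv6Network(f"{address_text}/{prefix_length}", strict=False)


def is_banned(address_text, banned_networks):
    """Check an address against a list of banned subnets."""
    address = ipaddress.IPv6Address(address_text)
    return any(address in network for network in banned_networks)


offender = "2001:980:1f44:c66d:a3:fa9:dea:9fe"
single = ban_scope(offender, 128)  # just this one address
lan = ban_scope(offender, 64)      # the whole home connection, most likely
isp = ban_scope(offender, 32)      # last resort: the entire ISP
```

With `[lan]` as the ban list, `is_banned("2001:980:1f44:c66d::1234", [lan])` is True, while an address from a neighbouring /64 still gets through.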

Addresses can have a maximum length of 45 characters, everything considered. You should reserve this much space for displaying them. Storing them in the database is best done in a binary format (which is 128 bits, or 16 bytes, long). In PHP, inet_pton() converts the textual form to this binary form and inet_ntop() converts it back.
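The binary round trip is the same idea in any language. As a sketch in Python, with made-up helper names:

```python
import ipaddress


def to_binary(text):
    """Convert a textual IPv6 address to its 16-byte form for storage."""
    return ipaddress.IPv6Address(text).packed


def from_binary(blob):
    """Convert 16 stored bytes back to the compressed textual form."""
    return str(ipaddress.IPv6Address(blob))


stored = to_binary("2001:0980:1F44:C66D:00A3:0FA9:0DEA:09FE")
shown = from_binary(stored)
```

A nice side effect is that differently written versions of the same address end up as identical bytes, so comparisons in the database just work.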


As gameserver admin
The only things changing for you are probably your server's IP and how you IP-ban people, so it's good to be familiar with the address notation.

Banning someone works the same: just ban the address. Only if someone keeps changing their IP might you want to block the entire /64 subnet. This is probably what identifies the home connection uniquely, though it might even be as large as a /48.
For example, someone uses this address:
2001:980:1f44:c66d:a3:fa9:dea:9fe

If the player keeps changing the IP, you might want to ban this part:
2001:980:1f44:c66d::/64

As a last resort you can block larger and larger subnets, until you've entirely disabled the ISP by blocking this:
2001:980::/32


Sources
Many bits and pieces, but one I wanted to explicitly name was the book IPv6 Essentials by Silvia Hagen (O'Reilly). It really cleared a lot up after reading lots of different things from lots of different sources.

Thanks to Romain Boissat for recommending that book, reading a draft of this post and providing a lot of feedback.

Also thanks to agwa and diminoten for commenting on Hacker News and contributing corrections.

Note that despite other people's reviews and my best efforts, there may still be errors in this article. If you spot any, feel free to correct me!