davidlyness [dot] com

Things I find interesting.


For the past year or so the internet community has been attempting to migrate (or at least attempting to prepare to migrate) from IPv4 to IPv6. The main reason for the transition is to enable - out of necessity - a larger IP address space. While IPv4 allocates 32 bits for an individual address, giving \(2^{32}=4294967296\) unique IP addresses, IPv6 allocates 128 bits, allowing \(2^{128}=340282366920938463463374607431768211456\) different addresses to be uniquely referenced.

However, IPv6 affords us many other benefits aside from the increased address space. While none of these may necessitate a transition by themselves, they are still welcome additions to the protocol. The following information is paraphrased from the IPv6 Wikipedia article.

Multicasting

IPv6 allows a packet to be transmitted to multiple recipients with a single operation, rather than having to specify each recipient individually (and use up bandwidth in the process). By sending a packet to the ff02::1 multicast group a packet will be transmitted to all hosts on the link.

Stateless address autoconfiguration (SLAAC)

ICMP - the Internet Control Message Protocol - and DHCP - the Dynamic Host Configuration Protocol - are also getting v6 version bumps with the introduction of IPv6. ICMPv6 and DHCPv6 allow automatic configuration of hosts connected to an IPv6 network. ICMPv6 allows stateless configuration, meaning that the local router will automatically respond to a multicast broadcast advertising its configuration parameters. If this is unavailable (depending on environment and application) DHCPv6 can be used as a fallback in much the same way as DHCPv4.

Mandatory network-layer security

Perhaps the most important additional enhancement, IPv6 now handles IPsec traffic as standard. IPsec, like HTTPS, enforces encryption and authentication of packets, but is protocol-agnostic. Although engineered into IPv4, IPsec will be mandatory for IPv6 connections and will hugely increase the security of all internet use.

Simplified processing by routers

Multiple enhancements have been made to simplify the process of packet forwarding and make it more efficient.
  • Due to the additional features of IPv6, the packet headers are on the order of twice as large as IPv4 headers. However, the layout has been simplified with fields that are rarely used being moved to optional header extensions (see "Options extensibility" below). For example, the Selective Directed Broadcast Mode option, providing unreliable UDP delivery to a set of IP addresses - which can range in size from 6 to 38 bytes - is rarely used and can be safely moved to the optional header extensions section of the packet.
  • There is no fragmentation of packets between routers over an IPv6 network. To ensure the correct MTU is set, hosts can perform end-to-end MTU discovery, end-to-end packet fragmentation, or use the default minimum of 1280 bytes. This eases the load on the hosts to reassemble fragmented packets, but should also mean that numerous VPN solutions work much more reliably.
  • The somewhat-redundant Internet-layer checksum has been removed. Since both the link layer (e.g. Ethernet) and transport layer (e.g. TCP) typically perform checksum calculations, it is very unlikely that such a checksum would detect an error not already caught at a higher or lower layer. This also eases the burden on intermediate IPv6 routers, as when the packet is altered due to the "time to live" (TTL) being decremented, they do not have to recalculate the checksum.
  • The TTL field has been renamed to "Hop Limit", an all round more intuitive term for what it represents.


Mobility

Mobile IPv6, i.e. IPv6 on mobile devices, is the IPv6 implementation that allows users to move from one network to another while maintaining a permanent (or at least sticky) IP address. (Otherwise, when dynamically switching from a 2G to a 3G network on a mobile device, any partially-downloaded files would become unusable, and voice / video calls would likely glitch as the transition is made.) Mobile IPv6 avoids the problem of triangular routing that plagues mobile IPv4.

Options extensibility

As noted above, optional header options are implemented as additional extensions separate from the main IPv6 header, which has a fixed size of 40 bytes. This allows the protocol to be completely extensible in the future, allowing quality of service, security and mobility features to be added without redesigning the base protocol.

Jumbograms

The most fun-sounding addition has been saved for last. IPv4 packets have a fixed maximum size of 65535 bytes. By specifying the "Jumbo Payload Option" header, IPv6 nodes can handle packets larger than this limit, known as jumbograms. This can increase performance over high-MTU links, but is also fun to say.


As discussed in a previous post on thwarting attackers who gain control of an unencrypted database, hashing is a useful technique which can be used to obfuscate passwords and other data that needs to be verified without storing the data in plaintext. However, as with any situation in which the source of the data is unknown, hashing algorithms on the server can be exploited. A vulnerability was recently given a lot of publicity where data sent using a normal HTTP request could tie up the server for several minutes to several hours.

A hash table is a common data structure used to store POST data sent to a web server. For example, whenever you comment on a post on this website, your name, email and comment are stored in a hash table as they are processed (in this case by PHP). It uses a hashing algorithm to map keys to values, using the hash as essentially an array index, the algorithm being specified by the implementation language. Alexander Klink and Julian Wälde, security researchers, were able to show that, if they could generate many strings that hashed to the same value, the hash table would degenerate to the point where 99% of CPU time was being used for the duration of the execution of the script.

The researchers describe the vulnerability in their paper.

If the language does not provide a randomized hash function or the application server does not recognize attacks using multi-collisions, an attacker can degenerate the hash table by sending lots of colliding keys. The algorithmic complexity of inserting n elements into the table then goes to O(n**2), making it possible to exhaust hours of CPU time using a single HTTP request.

The maximal POST request size is typically limited to 8 MB, which when filled with a set of multi-collisions would consume about four hours of CPU time on an i7 core. Luckily, this time can not be exhausted because it is limited by the max_input_time (default configuration: -1, unlimited), Ubuntu and several BSDs: 60 seconds) configuration parameter. If the max_input_time parameter is set to -1 (theoretically: unlimited), it is bound by the max_execution_time configuration parameter (default value: 30). On an i7 core, the 60 seconds take a string of multi-collisions of about 500k. 30 seconds of CPU time can be generated using a string of about 300k. This means that an attacker needs about 70-100kbit/s to keep one i7 core constantly busy. An attacker with a Gigabit connection can keep about 10.000 i7 cores busy.


By implementing a randomized hash function, as described above, the attacker cannot possibly force hash collisions on the scale required to tie up the web server. PHP uses the DJBX33 hash function, which has the property that if two strings collide under this function, then any other strings containing these as substrings collide too (this weakness is known as the "equivalent substring" property). This gives an attacker an effective means to compute hash collisions where otherwise a brute force attack would be the most effective route. The fact that most hash table algorithms are not cryptographically strong means that it can be trivial to reverse-engineer the algorithm to generate collisions.

This vulnerability isn't new: it was first published back in 2003, but only recently gained substantial attention by technology media, meaning that a fix had to be fast-tracked. And PHP was not the only scripting language affected: ASP.NET, Java, Python and Ruby, among others, are also subject to this vulnerability (although not all because of the "equivalent substring" weakness in their hashing algorithms). Microsoft have also released an emergency out-of-cycle patch addressing this vulnerability in their .NET framework.


In computing, a quine is a computer program which outputs a copy of its own source code. (See the Wikipedia page on quines for more information and examples.) I recently discovered a different sort of quine: mathematical ones.

Consider the function \[{1\over 2} < \left\lfloor \mathrm{mod}\left(\left\lfloor {y \over 17} \right\rfloor 2^{-17 \lfloor x \rfloor - \mathrm{mod}(\lfloor y\rfloor, 17)},2\right)\right\rfloor\] where
  • \(\lfloor x \rfloor\) is the floor function, i.e. the largest integer less than \(x\);
  • \(\bmod(b,m)\) is the modulo operation, which, when graphed over \(0\leq x\leq 106\) and \(n\leq y\leq n+17\) with n = 4 858 450 636 189 713 423 582 095 962 494 202 044 581 400 587 983 244 549 483 093 085 061 934 704 708 809 928 450 644 769 865 524 364 849 997 247 024 915 119 110 411 605 739 177 407 856 919 754 326 571 855 442 057 210 445 735 883 681 829 823 754 139 634 338 225 199 452 191 651 284 348 332 905 131 193 199 953 502 413 758 765 239 264 874 613 394 906 870 130 562 295 813 219 481 113 685 339 535 565 290 850 023 875 092 856 892 694 555 974 281 546 386 510 730 049 106 723 058 933 586 052 544 096 664 351 265 349 363 643 957 125 565 695 936 815 184 334 857 605 266 940 161 251 266 951 421 550 539 554 519 153 785 457 525 756 590 740 540 157 929 001 765 967 965 480 064 427 829 131 488 548 259 914 721 248 506 352 686 630 476 300.


This is known as Tupper's Formula (created by Jeff Tupper in 2001). It produces the following "graph".

quine picture

If you look closely, this graph is a bitmap which has been chosen out of all possible bitmaps of this size by the value of \(n\) specified above, making it not truly self-referential. A real (and complex) self-referential mathematical formula was created by Jakub Trávník earlier this year.