Advanced Linux Networking

Using iproute2

The iproute2 package ships with most Linux distributions, often under the name iproute. You can also obtain it from ftp://ftp.inr.ac.ru/ip-routing/, its official home. This package includes several commands, two of which are covered here: ip and tc.

Using ip

The iproute2 command that's used for manipulating routing tables and rules is ip. This program relies on several of the suboptions of IP: Advanced Router in the kernel configuration, as described earlier. The program is used as follows:

ip command [list | add | del] selector action

You can specify any of several commands. One of the most important of these is rule. You can use this command to add (add), delete (del), or display information on (list) specific routing rules. You specify a rule with the selector, which itself is composed of several items:

[from addr] [to addr] [tos TOS] [dev device-name] [pref number]

The from and to elements allow you to specify IP addresses, tos lets you specify a TOS value (which is a number, such as 4; this requires a kernel option that's described shortly), dev specifies the network device name (such as eth0), and pref signifies a preference number. These items collectively tell Linux how to identify packets to which a given rule applies. The ip rule command links these to an action, which has several components:

[table table-id] [nat address] [prohibit | reject | unreachable]

The table-id is a number identifying a particular routing table, nat lets you specify a new source address for the packet, and prohibit, reject, and unreachable are codes to indicate various methods of completely rejecting the packet.

Putting this all together, you might enter an ip command that resembles the following:

# ip rule add from 172.20.24.128 dev eth0 table 2

This rule tells the system to use routing table 2 for all traffic from 172.20.24.128 on eth0. What, though, is routing table 2? An ordinary Linux installation uses the route command to create the routing table, and there's precisely one routing table on such a system. The advanced routing features allow you to use multiple routing tables, which you set up with the ip route command. You can then quickly switch between different routing tables for handling different types of traffic, using other routing tools. The ip route command is more complex than the normal route, but its features are mostly a superset of those of the normal route command. Thus, you can use ip route much as you would route, as described in Chapter 2. One extension is particularly important, though: You can specify the routing table number with the table table-id option. For instance, you might use the following command to add a route to routing table 2:

# ip route add 10.201.0.0/16 dev eth1 table 2

Aside from the leading ip and the trailing table 2, this command works just like an equivalent route command. Specifically, it tells the system to pass all data for the 10.201.0.0/16 network over eth1 without sending it to another router. (In this case, eth1 should have an address on the 10.201.0.0/16 network.)
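The two steps described above (populating a table, then pointing a rule at it) can be tried together. The following is a minimal sketch rather than a definitive procedure: the run helper and the APPLY variable are hypothetical names introduced here so that the commands are only printed unless you explicitly ask for them to be executed, and the addresses and interface names are simply the ones used in the examples above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints each command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

policy_route_example() {
    # Populate routing table 2 with a directly connected route.
    run ip route add 10.201.0.0/16 dev eth1 table 2
    # Direct traffic from 172.20.24.128 on eth0 to routing table 2.
    run ip rule add from 172.20.24.128 dev eth0 table 2
    # Display the rules and the contents of table 2.
    run ip rule list
    run ip route show table 2
}

policy_route_example
```

Run as shown, the script only echoes the four commands. Setting APPLY=1 and running it as root would execute them, which assumes that eth0 and eth1 exist and that the kernel has the advanced routing options enabled.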

Using tc

The tc utility uses the QoS and/or Fair Queueing kernel configuration options. You can use it to manage outgoing network bandwidth, in order to prevent one class of traffic from monopolizing the available bandwidth. For instance, suppose your organization has two subnets, each corresponding to an office with a dozen users. If a user in one of these offices begins running a very bandwidth-intensive task, this action may degrade network performance for users in the other office. You can use tc to provide a partial fix by guaranteeing a certain amount of bandwidth for each subnet.

NOTE

It's important to remember that a TCP/IP router (or any computer on a TCP/IP network) can only control its outgoing traffic. Thus, tc can only adjust outgoing bandwidth. This works in a router because a sender will slow its transmission of TCP packets when it sees that your router is saturated, even if that saturation is created through a QoS policy. (This won't work for UDP packets, though.)

The basic syntax of tc is as follows:

tc [ options ] object command

Each of the parameters has certain possible values:

  • options: This can be -statistics (or -s), -details (or -d), or -raw (or -r).

  • object: This can be qdisc, class, or filter. The qdisc sets the queueing discipline (a specific rule). The class defines a set of packets that fit a category (such as one of the two offices). The filter brings these together to generate a filter rule.

  • command: The command is a set of parameters that define precisely what tc does with the object. What goes into a command is quite varied and object-specific.

To use tc, you generate a series of rules that together define the networks to which the computer is connected and how the available bandwidth should be allocated among these networks. For instance, suppose you want to implement a 50/50 split of 100Mbps of outgoing bandwidth between two offices. The Internet at large is on eth0, and both offices are on eth1, although one uses the 192.168.1.0/24 IP address subnet and the other uses 192.168.2.0/24. To begin the process, use tc to initialize a queueing discipline on eth1:

# tc qdisc add dev eth1 root handle 10: cbq bandwidth 100Mbit \
     avpkt 1000

This command can be broken down into several parts:

  • add dev eth1: This tells the system that you're adding a queueing discipline for eth1.

  • root: Some disciplines arrange themselves in virtual trees that branch off of a "root." This parameter tells tc that you're creating a new root for the tree.

  • handle 10:: This parameter defines a label (handle) for the discipline.

  • cbq: You must tell the system which queueing method to use. The Class-Based Queueing (CBQ) method is a common one. This entry should correspond to the name of a specific option in the QoS and/or Fair Queueing kernel configuration menu.

  • bandwidth 100Mbit: You must tell the system how much bandwidth is available on the network. In the case of a router with differing bandwidth on its separate ports, this will normally be the lesser bandwidth value; you don't want to overschedule the bandwidth that's actually available.

  • avpkt 1000: Network packets vary in size, but to schedule bandwidth use, the system must have some idea of what the average packet size will be. One thousand is a reasonable first guess, but it might be higher or lower on particular networks.

Now it's time to define classes for the network as a whole and for each of the subnets whose bandwidth you want to guarantee. You can do so with commands like the following:

# tc class add dev eth1 parent 10:0 classid 10:1 cbq \
     bandwidth 100Mbit rate 100Mbit allot 1514 weight 10Mbit \
     prio 8 maxburst 20 avpkt 1000

This command is very much like the previous one, but it sets up a class that covers the network as a whole. Note that it sets up the class to use the entire 100Mbps available bandwidth, because this particular class corresponds to the root; subsequent commands subdivide this bandwidth. This command has a few extra parameters and other differences, compared to the previous tc command:

  • class: Rather than qdisc, this command uses class to define the class.

  • parent 10:0: You specify the parent (the root of the tree) with this parameter. Note that you add to the handle specified with the previous command.

  • classid 10:1: This is the identifier for this particular class.

  • allot 1514: This is the MTU value (plus a few bytes overhead) for the network.

  • weight 10Mbit: This is a tuning parameter, and may need to be adjusted for your network.

  • prio 8: This is a priority number. With CBQ, classes with lower priority numbers are tried for packets first, so a lower number means a higher priority.

The rules for the individual subnets look very much like the last one:

# tc class add dev eth1 parent 10:1 classid 10:100 cbq \
     bandwidth 100Mbit rate 50Mbit allot 1514 weight 5Mbit \
     prio 5 maxburst 20 avpkt 1000 bounded
# tc class add dev eth1 parent 10:1 classid 10:200 cbq \
     bandwidth 100Mbit rate 50Mbit allot 1514 weight 5Mbit \
     prio 5 maxburst 20 avpkt 1000 bounded

These commands are nearly identical; they differ only in their classid settings. Both refer to the root class as a parent, and both set up a 50Mbps bandwidth allotment. (You can create an asymmetrical allotment if you like: say, 60Mbps and 40Mbps.) The bounded option tells Linux to not give more than the allotted bandwidth to a network class under any circumstances. This is often inefficient, because if one office isn't using its full allotment, the other can't use the unused amount. Omitting the bounded option gives Linux the flexibility to let one office "borrow" bandwidth if the other isn't using it, while enforcing a 50/50 split if both want bandwidth.
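For an asymmetrical allotment such as the 60Mbps and 40Mbps split mentioned above, the rate and weight values change for each class. The following sketch works out those values; cbq_share is a hypothetical helper name, and the choice of a weight equal to one tenth of the rate is an assumption inferred from the example commands (10Mbit for a 100Mbit rate, 5Mbit for a 50Mbit rate), not a documented requirement.

```shell
#!/bin/sh
# cbq_share TOTAL_MBIT PERCENT
# Hypothetical helper: prints the rate and weight parameters for one class.
cbq_share() {
    total=$1
    percent=$2
    # The class's share of the total bandwidth, in whole Mbit.
    rate=$(( total * percent / 100 ))
    # Assumption: weight is one tenth of the rate, as in the examples.
    weight=$(( rate / 10 ))
    echo "rate ${rate}Mbit weight ${weight}Mbit"
}

cbq_share 100 60   # prints: rate 60Mbit weight 6Mbit
cbq_share 100 40   # prints: rate 40Mbit weight 4Mbit
```

The printed values would replace the rate 50Mbit weight 5Mbit portion of the two tc class commands; the bandwidth 100Mbit parameter stays the same because it describes the link, not the class.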

Now it's necessary to associate a queueing discipline with each of the two classes:

# tc qdisc add dev eth1 parent 10:100 sfq quantum 1514b \
     perturb 15
# tc qdisc add dev eth1 parent 10:200 sfq quantum 1514b \
     perturb 15

These commands are similar to the original queueing discipline assignment. They tell Linux to use the Stochastic Fairness Queueing (SFQ) discipline to schedule traffic within each office's subnet. SFQ is popular for this purpose because it requires little CPU power, but other disciplines can be used if desired.

The commands to this point haven't provided a means for the kernel to differentiate traffic from the two offices (192.168.1.0/24 and 192.168.2.0/24). The final two commands accomplish this goal:

# tc filter add dev eth1 parent 10:0 protocol ip prio 100 u32 \
     match ip dst 192.168.1.0/24 flowid 10:100
# tc filter add dev eth1 parent 10:0 protocol ip prio 100 u32 \
     match ip dst 192.168.2.0/24 flowid 10:200

These commands are similar to the preceding ones, but they set up a filter rule to move traffic destined towards (dst) each of the two networks through the appropriate classes. Each rule is given an equal priority, and is matched using the u32 algorithm, which works on IP address blocks.

The preceding rules control the flow of data from the Internet to the local systems. To be complete, you must create a similar set of rules that control data passing in the opposite direction. These rules would resemble the preceding ones, but they would refer to eth0 (the external interface) rather than eth1 (the internal interface), and the final two filter commands would use src rather than dst to indicate that they control traffic originating from a local source rather than a destination.
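The mirrored rule set described above can be sketched as a small generator that prints the commands for the external interface. This is an illustration under stated assumptions, not a tested configuration: emit_upstream_rules is a hypothetical function name, the handle 10: and the class numbers are simply reused from the eth1 examples, the external link is assumed to carry the same 100Mbit, and the office addresses are assumed not to be rewritten (for example, by NAT) before packets are queued on eth0.

```shell
#!/bin/sh
# Hypothetical generator: prints (does not run) tc commands that shape
# traffic leaving DEV, classified by source subnet.
emit_upstream_rules() {
    dev=$1
    # Root queueing discipline and the class covering the whole link.
    echo "tc qdisc add dev $dev root handle 10: cbq bandwidth 100Mbit avpkt 1000"
    echo "tc class add dev $dev parent 10:0 classid 10:1 cbq bandwidth 100Mbit rate 100Mbit allot 1514 weight 10Mbit prio 8 maxburst 20 avpkt 1000"
    # One class, one SFQ discipline, and one src filter per office.
    for pair in 100:192.168.1.0/24 200:192.168.2.0/24; do
        id=${pair%%:*}    # class number, e.g. 100
        net=${pair#*:}    # subnet, e.g. 192.168.1.0/24
        echo "tc class add dev $dev parent 10:1 classid 10:$id cbq bandwidth 100Mbit rate 50Mbit allot 1514 weight 5Mbit prio 5 maxburst 20 avpkt 1000 bounded"
        echo "tc qdisc add dev $dev parent 10:$id sfq quantum 1514b perturb 15"
        echo "tc filter add dev $dev parent 10:0 protocol ip prio 100 u32 match ip src $net flowid 10:$id"
    done
}

emit_upstream_rules eth0
```

Printing the commands first makes it possible to review them before applying anything; once they look right, the output could be piped to a root shell.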
