Teach you how to grab bandwidth with TCP

What do you do when the network is slow and congested?

Doing it yourself is not for everyone, and not every scenario lets you log in to a server and pull tricks. At that point you solve the problem with money: pay the operator more for better service.

Paying more buys higher priority, more bandwidth, and better service quality. But priority is a relative concept: the total amount of resources stays the same, so priority becomes a price race, and whoever pays more goes first. Bandwidth, meanwhile, is a cross-sectional, instantaneous quantity; on a statistically multiplexed network it cannot be guaranteed as an average.

There are many ways to interpret "100 yuan a month for 100 Mbps." Users tend to read it as a guarantee that 100 Mbps is available at every instant of the month, but that is technically impossible. Internet traffic is characterized by bursts interleaved with silence; to achieve high utilization you need enough flows to statistically smooth the gap between the two. "Enough" is itself a statistical notion that fluctuates, and it is not "that much" all the time. So even if the averages work out on paper, the contracted bandwidth cannot be redeemed at any given moment.

Like the priority race, users compete by paying more for bandwidth, but the total bandwidth stays the same. More money does not create more bandwidth, only a larger share of it, much as a larger inflight crowds out buffer space and creates the illusion of having squeezed out a larger share; think of the money as the buffer here. Without a price cap this inevitably turns into an auction: you end up paying more just to keep (not obtain) the same (not more) bandwidth as before, because others are willing to pay more.

In short, if operators can make money simply by letting users bid against each other, they have no incentive to upgrade or expand capacity. Nobody wants such a frozen model. In practice pricing is usually either a flat monthly subscription with occasional discounts, or 95th-percentile-style billing, which is clearly more reasonable.

If all you care about is the cost of congestion, though, 95th-percentile billing is not really the point. What I mean is: don't charge for the cost, tax the benefit; that is much easier to reason about.

The smaller the inter-packet interval, the higher the charge. The tax rate is anchored to a reasonable, evenly paced inter-packet interval, and the tax owed is computed over the intervals of every packet in the sending sequence at the applicable rate. Whoever bursts harder pays more tax, which pushes everyone to send as smoothly as possible and thereby reduces congestion; it is the same idea as a big-city congestion charge. But it must not be prepaid, because prepayment almost certainly degenerates into another bidding war.
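
To make the arithmetic concrete, here is a tiny sketch of one way such a tax could be computed. The function name, the per-gap rate schedule, and the assumption that gaps shorter than the reference pacing interval are charged proportionally more are all my own reading of the rule above, not a spec:

#include <stddef.h>

/*
 * Illustration only: tstamp_us[] holds the send timestamps of a sequence,
 * ref_interval_us is the "reasonable regular pacing" interval the tax rate is
 * anchored to, and base_rate is the charge for a perfectly paced gap. Gaps
 * shorter than the reference are charged proportionally more, so a bursty
 * sender pays a higher total tax than a smooth one.
 */
static double congestion_tax(const unsigned long *tstamp_us, size_t n,
                             unsigned long ref_interval_us, double base_rate)
{
    double tax = 0.0;
    size_t i;

    for (i = 1; i < n; i++) {
        unsigned long gap = tstamp_us[i] - tstamp_us[i - 1];

        /* the applicable rate grows as the gap shrinks below the reference */
        tax += base_rate * (double)ref_interval_us / (double)(gap ? gap : 1);
    }
    return tax;
}

Billed this way, a sender who pushes the same number of packets in half the time pays roughly twice the tax, which is exactly the incentive to pace smoothly.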

Sell access to resources, not the resources themselves.

We have talked about how billing can alleviate congestion. Now assume the "congestion tax" above has become reality, you have enough money to pay whatever tax is due (in other words, you don't care about money), and you can also log in to the machine and tinker. How do you grab bandwidth?

This article provides two ways to grab bandwidth:

  • Constant-rate sending: no congestion control, just send at, say, 100 Mbps no matter what (a rough user-space sketch of this idea follows this list).
  • Throughput maintenance: no congestion control, try to hold an effective throughput of, say, 100 Mbps no matter what.
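
For the first approach, here is a rough user-space sketch of the idea. The function name and parameters are mine, and note that over an unmodified TCP stack the kernel's congestion control still throttles what actually leaves the host, which is exactly why the kernel-module route described below exists:

#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Sketch only: pace fixed-size writes so the application offers a constant
 * load of roughly rate_bps to an already-connected socket fd. This shows the
 * application-side pacing arithmetic, nothing more.
 */
static void send_at_constant_rate(int fd, uint64_t rate_bps, size_t chunk)
{
    char buf[4096];
    uint64_t interval_ns;
    struct timespec next;

    if (chunk > sizeof(buf))
        chunk = sizeof(buf);
    /* time budget per chunk: chunk bytes at rate_bps, in nanoseconds */
    interval_ns = (uint64_t)chunk * 8 * 1000000000ULL / rate_bps;
    memset(buf, 0, sizeof(buf));
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (;;) {
        if (write(fd, buf, chunk) < 0)
            break;
        next.tv_nsec += interval_ns;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec++;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}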

The barrier to entry deserves a mention. Writing a TCP sending algorithm (I won't call it cc here, because apart from plugging into the Linux TCP cc framework it has nothing to do with congestion control; on the contrary, it creates congestion) is not hard. The trouble is that many people don't know how to put such simple ideas into practice, yet it really is simple, and once you have done it you never forget it. On Linux kernel 4.9 or later, just write a kernel module; for details see Hard-coding cwnd (especially the systemtap method for kernels before 4.9).

There is not much to say about the Linux TCP cc framework itself: it offers both opportunity and constraint. Relentless cc could not enjoy it, because it could not escape the constraints of the Linux cc state machine:

Our implementation is not quite as elegant, and consequently is not as accurate as the mental model. It is based on an existing TCP implementation that supports plugable congestion control modules. Unfortunately, some of the cwnd adjustments are wired into the loss recovery code and can not be suppressed directly. They can only be offset later in the plugable Relentless Congestion Control module.

Next, a method for maintaining throughput: https://github.com/marywangran/pixie (pixie here does not mean "leather shoes"; it means "troublemaker", look it up in a dictionary yourself, I don't know the word well either. In case the link won't open, the source is posted as text at the end; write the Makefile yourself.)
There are two important points:

  • Use a sliding average: when packet loss suddenly rises, tighten up to avoid further loss; when it suddenly falls, get aggressive and squeeze out more resources. For simplicity only an arithmetic mean is used; an exponential moving average would do better (see the sketch after this list).
  • Pace with pacing_rate rather than with cwnd. pacing_rate is deterministic; ideally, squeezing via cwnd alone is useless, because the extra cwnd has not even been sent when the feedback comes back, at which point inflight < cwnd.
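
On the first point, here is a minimal sketch (mine, not part of pixie) of what swapping the windowed arithmetic mean for an exponential moving average of the loss ratio could look like, in the kernel's usual fixed-point style:

#include <linux/types.h>

/*
 * Sketch only: track losses / (acked + losses) as an exponential moving
 * average in Q10 fixed point with alpha = 1/8, instead of pixie's windowed
 * arithmetic mean. Names are illustrative.
 */
struct loss_ewma {
    u32 avg_q10;    /* smoothed loss ratio << 10 */
};

static inline void loss_ewma_update(struct loss_ewma *e, u32 acked, u32 losses)
{
    u32 sample_q10;

    if (!acked && !losses)
        return;
    sample_q10 = (losses << 10) / (acked + losses);
    /* avg = 7/8 * avg + 1/8 * sample, the same style as srtt smoothing */
    e->avg_q10 = e->avg_q10 - (e->avg_q10 >> 3) + (sample_q10 >> 3);
}

pixie itself (attached at the end) keeps the simple window; this is just what the "better" variant mentioned above might look like.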

It is worth pointing out that although this kind of thing works, it is nothing to be proud of. A TCP algorithm truly worth mentioning must, above all, not harm others; whether it serves itself well is its own business and secondary. Therefore, an algorithm that cannot be proven mathematically, and demonstrated in typical practice, to be harmless will never be widely deployed.

The logic behind this is that TCP is so widely used that it touches almost everything on the Internet; the Internet, by and large, is TCP. If an arbitrary algorithm were widely deployed, even a slight flaw (intentional or not) would harm transmission across the entire Internet, with enormous impact. Even bbr, which appeared and has been gradually deployed over the past seven years, still has not made it into a formal RFC; its widespread deployment is purely an engineering outcome, and in theory the risk remains huge.

On the other hand, if you only use it yourself in a small corner, the bandwidth you can seize is a drop in the ocean compared with real Internet link capacity from the edge to the backbone, and your existence does not matter. And long before you deploy such an algorithm widely enough to be noticed, non-technical forces will step in, force you to stop, and make you pay for the damage done.

quic makes me worry about the future of congestion control, because it is far too easy to customize cc inside an application. CDN vendors can easily deviate from universal standards (such as AIMD) and conjure up their own cc, just like my brainless pixie, then proudly push it to production. Add to that the general lack of knowledge of this field among Internet-company engineers, multiplied by the double "massive" of Internet companies (a massive number of under-informed programmers deploying a massive number of services), and it is like highways filled with upstarts who just got their licenses and bought a Tesla. The overall future of the Internet is worrying.

So: a little of this is amusing, a lot of it does harm, and sooner or later the debt comes due, so don't take the advantage.

Attachment: tcp_pixie.c

#include <linux/module.h>
#include <net/tcp.h>
#include <linux/inet.h>

static int rate = 100000000;    /* target rate, in bytes per second (fed to sk_pacing_rate) */
module_param(rate, int, 0644);
static int feedback = 2;        /* history window length, in units of srtt/2 */
module_param(feedback, int, 0644);

/* one delivery sample recorded per rate_sample callback */
struct sample {
    u32 _acked;         /* segments newly acked or sacked */
    u32 _losses;        /* segments newly marked lost */
    u32 _tstamp_us;     /* when the sample was taken */
};

struct pixie {
    u64 rate;           /* target rate, in bytes per second */
    u16 start;          /* oldest live sample in the ring */
    u16 end;            /* next ring slot to fill */
    u32 curr_acked;     /* acked segments summed over the window */
    u32 curr_losses;    /* lost segments summed over the window */
    struct sample *samples;
};

static void pixie_main(struct sock *sk, const struct rate_sample *rs)
{
    struct tcp_sock *tp = tcp_sk(sk);
    struct pixie *pixie = inet_csk_ca(sk);
    u32 now = tp->tcp_mstamp;
    u64 cwnd;
    u16 start, end;
    u64 prate;

    if (rs->delivered < 0 || rs->interval_us <= 0)
        return;

    cwnd = pixie->rate;
    if (!pixie->samples) {
        /* sample ring allocation failed: fall back to a plain
         * rate-to-cwnd conversion at a fixed pacing rate.
         */
        cwnd /= tp->mss_cache;
        cwnd *= (tp->srtt_us >> 3);
        cwnd /= USEC_PER_SEC;
        tp->snd_cwnd = min_t(u64, 2 * cwnd, tp->snd_cwnd_clamp);
        sk->sk_pacing_rate = min_t(u64, pixie->rate, READ_ONCE(sk->sk_max_pacing_rate));
        return;
    }

    /* When packet loss suddenly rises, tighten to limit further loss while
     * keeping throughput; when it suddenly falls, get aggressive and squeeze
     * out more than the nominal share. A moving average does the smoothing.
     */
    pixie->curr_acked += rs->acked_sacked;
    pixie->curr_losses += rs->losses;
    end = pixie->end++;
    pixie->samples[end]._acked = rs->acked_sacked;
    pixie->samples[end]._losses = rs->losses;
    pixie->samples[end]._tstamp_us = now;

    start = pixie->start;
    while (start < end) {
        /* Keep at least feedback/2 srtts of history. The longer the window,
         * the less jitter, but performance may fall short of expectations;
         * "jitter" here is to be read the other way around.
         */
        if (2 * (now - pixie->samples[start]._tstamp_us) > feedback * tp->srtt_us) {
            pixie->curr_acked -= pixie->samples[start]._acked;
            pixie->curr_losses -= pixie->samples[start]._losses;
            pixie->start++;
        }
        start++;
    }

    /* nothing acked in the window yet: avoid dividing by zero below */
    if (!pixie->curr_acked)
        return;

    /* scale cwnd and pacing rate by (acked + lost) / acked to make up for
     * whatever was lost within the window.
     */
    cwnd /= tp->mss_cache;
    cwnd *= pixie->curr_acked + pixie->curr_losses;
    cwnd /= pixie->curr_acked;
    cwnd *= (tp->srtt_us >> 3);
    cwnd /= USEC_PER_SEC;

    prate = (u64)(pixie->curr_acked + pixie->curr_losses) << 10;
    prate /= pixie->curr_acked;
    prate *= pixie->rate;
    prate >>= 10;

    printk("##### curr_ack:%u curr_loss:%u rsloss:%d start:%u end:%u cwnd:%llu rate:%d prate:%llu\n",
           pixie->curr_acked,
           pixie->curr_losses,
           rs->losses,
           pixie->start,
           pixie->end,
           cwnd,
           rate,
           prate);
    tp->snd_cwnd = min_t(u64, cwnd, tp->snd_cwnd_clamp);
    /* Drive with pacing_rate (rather than cwnd) so that more packets arrive
     * per unit time to offset the losses. If you drive with cwnd instead,
     * don't pace.
     */
    sk->sk_pacing_rate = min_t(u64, prate, sk->sk_max_pacing_rate);
    //sk->sk_pacing_rate = min_t(u64, pixie->rate, sk->sk_max_pacing_rate);
}

static void pixie_init(struct sock *sk)
{
    struct pixie *pixie = inet_csk_ca(sk);

    pixie->rate = (u64)rate;
    pixie->start = 0;
    pixie->end = 0;
    pixie->curr_acked = 0;
    pixie->curr_losses = 0;
    /* one slot per possible u16 index, so the u16 ring counters can wrap
     * without ever indexing out of bounds.
     */
    pixie->samples = kmalloc((U16_MAX + 1) * sizeof(struct sample), GFP_ATOMIC);
    cmpxchg(&sk->sk_pacing_status, SK_PACING_NONE, SK_PACING_NEEDED);
}

static void pixie_release(struct sock *sk)
{
    struct pixie *pixie = inet_csk_ca(sk);

    kfree(pixie->samples);
}


static u32 pixie_ssthresh(struct sock *sk)
{
    return TCP_INFINITE_SSTHRESH;
}

static u32 pixie_undo_cwnd(struct sock *sk)
{
    struct tcp_sock *tp = tcp_sk(sk);

    return tp->snd_cwnd;
}

static struct tcp_congestion_ops tcp_pixie_cong_ops __read_mostly = {
    .flags        = TCP_CONG_NON_RESTRICTED,
    .name         = "pixie",
    .owner        = THIS_MODULE,
    .init         = pixie_init,
    .release      = pixie_release,
    .cong_control = pixie_main,
    .ssthresh     = pixie_ssthresh,
    .undo_cwnd    = pixie_undo_cwnd,
};

static int __init pixie_register(void)
{
    BUILD_BUG_ON(sizeof(struct pixie) > ICSK_CA_PRIV_SIZE);
    return tcp_register_congestion_control(&tcp_pixie_cong_ops);
}

static void __exit pixie_unregister(void)
{
    tcp_unregister_congestion_control(&tcp_pixie_cong_ops);
}

module_init(pixie_register);
module_exit(pixie_unregister);
MODULE_LICENSE("GPL");
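
For completeness, the usual way to use a module like this (standard kernel mechanics, nothing pixie-specific; adjust paths to your own setup): build it out of tree against your kernel headers with a one-line Makefile (obj-m += tcp_pixie.o), load it with insmod tcp_pixie.ko, then select it globally via sysctl net.ipv4.tcp_congestion_control=pixie or per socket with setsockopt(TCP_CONGESTION).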

Leather shoes in Wenzhou, Zhejiang get wet; when it rains and water gets in, they won't get fat.