Diagnosing a weird LoL lag issue

I’m having a weird lag problem where I get massive packet loss playing LoL. But the reported ping stays low (35ms) and the game is mostly playable. Some small hiccups that look lag related, more than I’m used to, but it’s not hugely awful. Really don’t know what’s going on.

My logs are telling me I’m losing 2.4% of my packets. That seems awfully bad; 0.1% or so is more typical. Borderline surprised it’s playable, but who knows. It’s not clear why the LoL client still reports the 35ms ping; the usual client behavior is if you’re losing packets the reported ping goes up. Or at least it used to be, maybe that changed?

I’ve done every test I can think of to prove that it’s not my home network in general to blame. I can iperf3 burst UDP traffic all day reliably from my house to somebits.com in Kansas. That shares the same route over my ISP as far as cr1-9greatoaks-te-0-7-0-8.bb.spectrumnet.us; then the somebits packets go through he.net on their way to Kansas while the LoL packets go through spectrumnet up to Portland and Riot’s datacenter. That route looks fine, as well as I can test with mtr -u (a UDP traceroute).

Confusing things more, a friend of mine also plays on the same ISP in San Francisco. His client is not showing any lag like mine is. But we share almost the same route: our routes converge at the fourth hop (76-14-93-222.sf-cable.astound.net / 76-14-93-218.sf-cable.astound.net) and are the same from there onward. So if there were a bad router it’d be close to my house. But then my link to somebits.com goes through that same route and I can test that thoroughly, and it’s showing no problem.

Frankly all signs point to something being wrong with my Mac, the client itself. Only I can reproduce the problem on a second machine, my laptop! Could it be my router? Possibly. But it’s worked fine in the past. I’ll try to test without the router soon.

Here’s some stats from a Wireshark capture I did in an ARAM I just played. I was Amumu.

LoL server address: 192.64.170.78

Wireshark statistics

  • 39750 packets from server, 44769 packets from client
  • 2003 packets reported lost by the client. 2.4% packet loss rate!
  • 39kbps from server, 17kbps from client
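  • Average packet size 228 bytes as reported by Wireshark (though the byte and packet totals below work out to about 114 bytes per packet)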
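As a rough check on that percentage, here is a small sketch that recomputes the loss rate from the counts above. Which denominator the LoL client actually uses is not known, so two plausible ones are shown; the variable names are made up for illustration.

```python
# Counts taken from the Wireshark statistics above.
from_server = 39750
from_client = 44769
reported_lost = 2003


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Loss relative to every packet captured, both directions.
loss_vs_total = pct(reported_lost, from_server + from_client)

# Loss relative to server packets only, assuming (hypothetically) that the
# lost packets were all server packets that never reached the capture.
loss_vs_server = pct(reported_lost, from_server + reported_lost)

print(f"{loss_vs_total:.1f}% of all packets")
print(f"{loss_vs_server:.1f}% of server packets")
```

The 2.4% figure matches the first denominator. Measured against server packets alone, the same 2003 lost packets would be closer to 4.8%.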

"Address","Port","Packets","Bytes","Packets A → B","Bytes A → B","Packets B → A","Bytes B → A","Latitude","Longitude"
"192.64.170.78",5107,84517,9616369,39748,6599894,44769,3016475,"-","-"
"192.168.0.20",53040,84517,9616369,44769,3016475,39748,6599894,"-","-"


==================================================================================================================================
Packet Lengths:
Topic / Item       Count         Average       Min val       Max val       Rate (ms)     Percent       Burst rate    Burst start  
----------------------------------------------------------------------------------------------------------------------------------
Packet Lengths     84519         227.56        54            1038          0.0627        100%          0.2400        73.682       
 0-19              0             -             -             -             0.0000        0.00%         -             -            
 20-39             0             -             -             -             0.0000        0.00%         -             -            
 40-79             46272         62.08         54            79            0.0343        54.75%        0.1600        2.285        
 80-159            26150         101.90        80            159           0.0194        30.94%        0.0900        838.818      
 160-319           7204          228.76        160           319           0.0053        8.52%         0.0500        351.673      
 320-639           4120          437.28        320           639           0.0031        4.87%         0.0600        73.648       
 640-1279          773           814.54        640           1038          0.0006        0.91%         0.0600        73.717       
 1280-2559         0             -             -             -             0.0000        0.00%         -             -            
 2560-5119         0             -             -             -             0.0000        0.00%         -             -            
 5120 and greater  0             -             -             -             0.0000        0.00%         -             -            

----------------------------------------------------------------------------------------------------------------------------------
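The average packet size can be cross-checked from the bucket rows of the table. This is only a sketch using the counts and per-bucket averages printed above, compared against the byte and packet totals in the conversation CSV.

```python
# (count, average length in bytes) for each non-empty bucket in the table above.
buckets = [
    (46272, 62.08),    # 40-79
    (26150, 101.90),   # 80-159
    (7204, 228.76),    # 160-319
    (4120, 437.28),    # 320-639
    (773, 814.54),     # 640-1279
]

total_packets = sum(count for count, _ in buckets)
total_bytes = sum(count * avg for count, avg in buckets)

# Count-weighted mean packet length across all buckets.
weighted_mean = total_bytes / total_packets

# Same figure from the conversation CSV: total bytes / total packets.
conversation_mean = 9616369 / 84517

print(total_packets)
print(f"{weighted_mean:.2f} bytes (from buckets)")
print(f"{conversation_mean:.2f} bytes (from conversation totals)")
```

Both come out near 114 bytes, about half of the 227.56 shown in the summary row. Why the summary row differs isn't clear from the capture alone.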


$ sudo mtr -u 192.64.170.78
                                   My traceroute  [v0.85]
ub (0.0.0.0)                                                        Tue Aug  4 17:43:25 2015
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                    Packets               Pings
 Host                                             Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. router.nelson.monkey.org                       0.0%   981    0.2   0.2   0.2   0.7   0.0
 2. 76-14-35-1.sf-cable.astound.net                0.0%   981    9.8  11.8   4.0 156.4   9.6
 3. 104.220.254.17                                 0.0%   981    7.4   9.2   3.6  31.3   4.0
 4. 76-14-93-218.sf-cable.astound.net              0.0%   981   14.6  17.2   7.7 234.4  22.8
    76-14-93-222.sf-cable.astound.net
 5. cr1-wsache-a-be-100.bb.spectrumnet.us          0.0%   981   11.8  14.2   9.1  37.2   3.8
 6. cr1-55SMarket-te-0-0-0-8.bb.spectrumnet.us     0.0%   981   11.1  14.7  10.3  37.6   4.7
    cr1-55SMarket-te-0-0-0-10.bb.spectrumnet.us
    cr1-55SMarket-te-0-0-0-9.bb.spectrumnet.us
    cr1-55SMarket-te-0-0-0-19.bb.spectrumnet.us
    cr1-55SMarket-te-0-0-0-11.bb.spectrumnet.us
 7. cr1-9greatoaks-te-0-7-0-6.bb.spectrumnet.us    0.0%   981   11.5  14.2   8.2  36.4   3.7
    cr1-9greatoaks-te-0-7-0-8.bb.spectrumnet.us
    cr1-9greatoaks-te-0-7-0-9.bb.spectrumnet.us
    cr1-9greatoaks-te-0-7-0-7.bb.spectrumnet.us
 8. cr1-pdx-te-0-0-0-4.bb.spectrumnet.us           0.1%   981   44.3  30.8  23.8  56.2   4.5
    cr1-pdx-te-0-0-0-5.bb.spectrumnet.us
    cr1-pdx-te-0-0-0-6.bb.spectrumnet.us
 9. cr2-fdcp-t4-4.bb.spectrumnet.us                0.0%   981   29.4  34.2  25.2 376.9  25.7
10. 216.243.25.86                                  0.0%   981   28.3  29.9  24.8  51.3   4.0
11. ???


[Graph: packets per second]

Update

Some updates after a few experiments

  • Problem definitely occurs on two different computers, an iMac and a Macbook Air
  • Problem occurs without my router / switch. Still plugged into my cable modem.
  • I played a game with a friend of mine who happens to also be in San Francisco on the same ISP and plays on a Mac. He had no lag, I did. Here’s my Logs of Lag report and here’s his. I lost 80 packets a minute, he lost 2. WTF? According to traceroute our routes are the same after the 4th hop above, so if it’s a problem in the route it’s something very close to my house at my ISP.
  • However, I can happily shove a huge amount of UDP traffic into my house with no problems. And those packets also traverse that same route to hop 4.
  • I see the same problem on a Windows box in my house. (35 packets lost / minute in a Custom game, which is in line with results on the Macs I play on.)

My next step is to try playing through a VPN and see if it helps. I’d also like to understand better what it means for the LoL client to know it’s lost 2.4% of packets and yet still report a low ping. That seems odd; usually the reported ping number goes up in response to retransmits, or at least I thought it did.

Update 2

A friend of mine has a VPS at directspace.net, which is either in the same datacenter as Riot NA or else very nearby. He ran an iperf2 server for me. Here’s some redacted data testing from my problem network.

iperf shows 69 packets out of 26224 lost, and 47 out of order. That’s about normal for a working link, I think, and about 1/10th the loss rate I’m seeing from the game.
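The comparison can be worked through with the numbers from the iperf server report in this update. This is a rough sketch; the 2.4% game figure is the one from the Wireshark capture earlier, and the variable names are illustrative.

```python
# From the iperf server report: 69 lost out of 26224 datagrams.
iperf_lost = 69
iperf_total = 26224
iperf_loss_pct = 100.0 * iperf_lost / iperf_total

# Loss rate reported for the game earlier in the post.
game_loss_pct = 2.4
ratio = game_loss_pct / iperf_loss_pct

# Load generated by the test: 26225 datagrams of 300 bytes sent in 60 seconds.
datagrams_per_sec = 26225 / 60
mbits_per_sec = 26225 * 300 * 8 / 60 / 1e6

print(f"iperf loss: {iperf_loss_pct:.2f}%")
print(f"game loss is about {ratio:.0f}x higher")
print(f"{datagrams_per_sec:.0f} datagrams/s, {mbits_per_sec:.2f} Mbit/s")
```

Note that the test pushes roughly 437 datagrams per second, several times the packet rate of the game capture, and still loses far less.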

mtr shows at worst 1 packet dropped out of 100. The stddev to hop 4 is higher than I’d like though.

$ iperf -l 300 -u -t 60 -i 10 -c XXX.XXX.XXX.XXX
------------------------------------------------------------
Client connecting to XXX.XXX.XXX.XXX, UDP port 5001
Sending 300 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.0.65 port 34071 connected with XXX.XXX.XXX.XXX port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec
[  3] 10.0-20.0 sec  1.25 MBytes  1.05 Mbits/sec
[  3] 20.0-30.0 sec  1.25 MBytes  1.05 Mbits/sec
[  3] 30.0-40.0 sec  1.25 MBytes  1.05 Mbits/sec
[  3] 40.0-50.0 sec  1.25 MBytes  1.05 Mbits/sec
[  3] 50.0-60.0 sec  1.25 MBytes  1.05 Mbits/sec
[  3]  0.0-60.0 sec  7.50 MBytes  1.05 Mbits/sec
[  3] Sent 26225 datagrams
[  3] Server Report:
[  3]  0.0-60.0 sec  7.48 MBytes  1.05 Mbits/sec   1.561 ms   69/26224 (0.26%)
[  3]  0.0-60.0 sec  47 datagrams received out-of-order

$ mtr --report -c 100 -u XXX.XXX.XXX.XXX
Start: Wed Aug  5 21:37:40 2015
HOST: ub                          Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- router                     0.0%   100    0.2   0.2   0.2   0.5   0.0
  2.|-- 76-14-35-1.sf-cable.astou  1.0%   100    9.3  11.0   5.4  57.1   7.2
  3.|-- 104.220.254.17             0.0%   100    9.9   9.4   5.0  28.1   4.1
  4.|-- 76-14-93-222.sf-cable.ast  0.0%   100   14.5  15.0   9.9 114.6  11.0
  5.|-- cr1-wsache-a-be-100.bb.sp  0.0%   100   11.9  14.9  10.2  33.2   3.7
  6.|-- cr1-55SMarket-te-0-0-0-8.  0.0%   100   11.9  13.4  10.4  32.0   2.3
  7.|-- cr1-9greatoaks-te-0-7-0-8  0.0%   100   14.6  15.1  10.3  33.1   4.6
  8.|-- cr1-pdx-te-0-0-0-4.bb.spe  0.0%   100   26.6  29.8  26.2  47.5   3.9
  9.|-- cr2-pdx-te-0-1-0-4.bb.spe  0.0%   100   29.7  29.6  26.1  47.8   3.3
 10.|-- directspace.nwax.net       1.0%   100   29.9  31.8  26.0 166.0  15.4
 11.|-- 69-163-34-113.in-addr.arp  0.0%   100   28.1  28.8  25.2  31.4   1.2
 12.|-- XXX.XXX.XXX.XXX            0.0%   100   28.6  29.3  26.1  47.0   2.7

Update 3: I have a theory: out of order packets. My mtr tests are telling me there’s something a bit flaky between hops 3 and 4. Occasionally a packet will take a very long time, like 200ms, and the stddev on that hop is way higher than the others. I wonder if an out of order packet causes the LoL client to drop it or otherwise treat it as a lost packet? To further complicate things, mtr -u shows me that there are many different paths on my route. All the same lengths, but node 4 (the suspect one) has a couple of different IP addresses that show up. If only one path is congested that will cause a lot of out of order packets.

I don’t know what LoL does with out of order packets. ENet, the protocol LoL is based on, is pretty clear about what it does:

ENet provides sequencing for all packets by assigning to each sent packet a sequence number that is incremented as packets are sent. ENet guarantees that no packet with a higher sequence number will be delivered before a packet with a lower sequence number, thus ensuring packets are delivered exactly in the order they are sent.

For unreliable packets, ENet will simply discard the lower sequence number packet if a packet with a higher sequence number has already been delivered. This allows the packets to be dispatched immediately as they arrive, and reduce latency of unreliable packets to an absolute minimum. For reliable packets, if a higher sequence number packet arrives, but the preceding packets in the sequence have not yet arrived, ENet will stall delivery of the higher sequence number packets until its predecessors have arrived.

But I don’t know how LoL uses these features, in particular what packets are marked reliable / unreliable.
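To make the quoted rule for unreliable packets concrete, here is a minimal sketch of a receiver that follows it. It is not ENet's actual code and says nothing about what LoL does; the function name is hypothetical and sequence-number wraparound is ignored.

```python
def deliver_unreliable(arrivals):
    """Split sequence numbers (in arrival order) into delivered and discarded.

    Follows the rule quoted above: a packet is discarded if a packet with a
    higher sequence number has already been delivered. Duplicates of the
    latest packet are discarded too, which is an assumption of this sketch.
    """
    delivered, discarded = [], []
    highest = -1  # highest sequence number delivered so far
    for seq in arrivals:
        if seq > highest:
            delivered.append(seq)
            highest = seq
        else:
            discarded.append(seq)
    return delivered, discarded


# Packet 3 took a slow path and arrived after packets 4 and 5.
delivered, discarded = deliver_unreliable([0, 1, 2, 4, 5, 3, 6])
print(delivered)  # [0, 1, 2, 4, 5, 6]
print(discarded)  # [3]
```

Under this rule a packet that merely arrives late looks the same to the application as one that never arrived, which is what the theory needs.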

The good thing about this theory is it explains a lot of the facts I see, including why my friend at the same ISP doesn’t have the same lag. (He’s beyond hop 4!). The bad thing about this theory is I still can’t really reproduce the presumed failure with iperf. iperf3 does report out of order packets (as an error message!) but I’ve only been able to induce it to do that with ridiculously high bandwidth tests, and even then not reliably. But maybe iperf3 is really polite and sends packets spaced exactly N ms apart so is less likely to have an ordering problem?

Here’s an ICMP traceroute showing the high stddev at node 4.

$ mtr -n somebits.com

                             My traceroute  [v0.85]
ub (0.0.0.0)                                           Wed Aug  5 23:03:04 2015
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                       Packets               Pings
 Host                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 192.168.0.1                       0.0%   539    0.3   0.3   0.2  29.1   1.2
 2. 76.14.35.1                        0.4%   539    8.0  11.0   4.7  60.6   7.7
 3. 104.220.254.17                    0.0%   539    8.0   9.6   5.5  30.4   4.7
 4. 76.14.93.222                      0.6%   539   13.5  17.0   9.7 229.9  22.1
 5. 208.76.187.33                     0.0%   539   28.9  14.1  10.2  36.7   4.1
 6. 208.76.185.26                     0.0%   538   12.8  14.6   9.0  35.5   4.8
 7. 208.76.185.58                     0.0%   538   13.6  14.0  10.0  36.8   3.7
 8. 206.223.116.37                    0.0%   538   14.7  16.5   8.5  44.6   5.6
 9. 184.105.213.106                   0.0%   538   38.3  44.2  37.0  74.5   5.9
10. 184.105.222.22                    0.2%   538   91.2  82.6  62.7 121.7  14.5
11. 184.105.213.37                    0.0%   538   71.1  69.6  62.9  96.7   5.8
12. 216.66.78.90                      0.2%   538   56.5  56.6  48.1  83.5   6.2
13. 69.30.209.138                     0.0%   538   49.6  54.4  47.9  83.6   6.1
14. 192.187.107.5                     0.0%   538   53.4  54.9  48.2  87.9   5.4
15. 107.150.51.74                     0.0%   538   51.3  52.1  47.4  96.1   4.8
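One way to pick out the odd hop without eyeballing the table is to compare each hop's StDev with the median across hops. The sketch below does that with the values copied from the output above; the 3x cutoff is an arbitrary choice for illustration, not an established rule.

```python
from statistics import median

# (hop number, StDev in ms) copied from the mtr output above.
stdev_by_hop = [
    (1, 1.2), (2, 7.7), (3, 4.7), (4, 22.1), (5, 4.1),
    (6, 4.8), (7, 3.7), (8, 5.6), (9, 5.9), (10, 14.5),
    (11, 5.8), (12, 6.2), (13, 6.1), (14, 5.4), (15, 4.8),
]

# Typical jitter along the route.
typical = median(stdev for _, stdev in stdev_by_hop)

# Arbitrary cutoff: flag hops whose StDev is more than 3x the median.
CUTOFF = 3.0
suspects = [hop for hop, stdev in stdev_by_hop if stdev > CUTOFF * typical]

print(f"median StDev: {typical} ms")
print(f"suspect hops: {suspects}")
```

With these numbers only hop 4 crosses the cutoff; hop 10 is elevated but stays under it.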

Update 4: A VPN fixes my problem! I sort of predicted it would, glad to confirm my hunch and also nice to have a workaround until the real problem is fixed. I set up a TCP VPN through Cloak, promised to US West Coast. I ended up going through Seattle. The resulting log shows a little more latency than without the VPN but also nearly no packet loss. Yay!

I set up Cloak because they have a really great Mac client, easy to get going. If I make a habit of this I may consider a VPN endpoint I own up in Seattle / Portland. Would rather just fix my ISP problem though. Don’t want to be paying $10 a month just to fix someone else’s broken service.

Update 5: this problem went away, and I don’t know why. I’m now playing games with 0.2 packets lost / s, very good, and no obvious lag. It may have been fixed by the Riot server move to Chicago, that’s the most likely explanation. But I’m still struck by the way I was getting this lag and my friend playing the same game with me on the same ISP wasn’t! That fourth hop from me is still showing suspiciously high standard deviation, so whatever that means it may well be unrelated.

2 thoughts on “Diagnosing a weird LoL lag issue”

  1. Cloak’s default openvpn config seems to be UDP. Did you end up changing it? Because if not, then the whole thing is even more mysterious!
