OpenWRT vs Starlink: DHCP leases

Finally captured a rumored bug as it was happening. After Starlink Dishy resets itself, the home router can’t get a working Internet connection. (Note: this is not using Starlink’s router.) Folks report various ways to fix it, often by plugging in Starlink’s router temporarily. I think resetting your home router alone is sufficient. Or, even less intrusive, resetting just the WAN port. Some folks on Reddit call this the “DHCP loop” behavior.

In summary: Dishy hands out a temporary IP address, 192.168.100.100, with a very short DHCP lease. This seems to confuse routers, which keep renewing the temporary address rather than getting a permanent address in 100.64.0.0/10.

Dishy behavior

Dishy is itself a DHCP server; it acts more or less like a cable modem. The reported behavior of Dishy is that right after it reboots and before it’s acquired satellites, the DHCP server hands out a temporary address of 192.168.100.100 to any router that asks. Crucially, the DHCP lease on this is a tiny 5 seconds. It also has a gateway of 192.168.100.1 (Dishy itself) but this gateway doesn’t really work.

Once Dishy configures itself with Starlink it changes and instead gives out a semi-permanent CGNAT address like 100.71.6.46. It expects the router to switch to this new address; the published final gateway is something like 100.127.255.3. This DHCP lease is for 5 minutes.
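
To make the two phases concrete, here is a minimal Python sketch that labels a WAN address as Dishy's temporary one or a real CGNAT one. Treating all of 192.168.100.0/24 as "temporary" is my assumption for illustration; the only temporary address actually reported is 192.168.100.100. The function name is made up.

```python
import ipaddress

# Assumption: anything in Dishy's own 192.168.100.0/24 is a pre-lock address.
DISHY_TEMP_NET = ipaddress.ip_network("192.168.100.0/24")
# The shared address space reserved for carrier-grade NAT (RFC 6598).
CGNAT_NET = ipaddress.ip_network("100.64.0.0/10")


def classify_wan_address(addr: str) -> str:
    """Return 'temporary', 'cgnat', or 'other' for a WAN address."""
    ip = ipaddress.ip_address(addr)
    if ip in DISHY_TEMP_NET:
        return "temporary"
    if ip in CGNAT_NET:
        return "cgnat"
    return "other"


print(classify_wan_address("192.168.100.100"))  # temporary
print(classify_wan_address("100.71.6.46"))      # cgnat
```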

Everything about what Dishy is doing makes sense to me and seems reasonable. The only slightly odd thing is the 5 second DHCP lease on the temporary, but presumably that’s set because Dishy knows it wants to swap the temporary address out quickly.

OpenWRT behavior

Unfortunately OpenWRT doesn’t react well to Dishy’s DHCP hijinks. In practice what happens is OpenWRT stays stuck at the old 192.168.100.100 address and never switches over. Here’s a screenshot of the bad state in OpenWRT’s status panel:

I tried prying in to OpenWRT a little more closely to see what was going on. I’m fairly ignorant of how it works though, so didn’t get far.

The router is running a persistent process to maintain this DHCP lease:

udhcpc -p /var/run/udhcpc-eth0.pid -s /lib/netifd/dhcp.script -f -t 0 -i eth0 -x hostname:OpenWrt -C -O 121

That all looks normal enough. Unfortunately I wasn’t able to capture the DHCP traffic or otherwise debug the request/responses that were going on. However, syslogging was helpful:

Logs for normal operation, once every 2.5 minutes:
Apr 7 18:05:16 OpenWrt netifd: wan (1527): udhcpc: lease of 100.71.0.14 obtained, lease time 300
Apr 7 18:07:46 OpenWrt netifd: wan (1527): udhcpc: sending renew to 100.127.255.3

First DHCP success after router reboots:
Apr 7 18:15:15 OpenWrt netifd: wan (1527): udhcpc: sending select for 192.168.100.100
Apr 7 18:15:15 OpenWrt netifd: wan (1527): udhcpc: lease of 192.168.100.100 obtained, lease time 122
Apr 7 18:15:15 OpenWrt netifd: Interface ‘wan’ is now up

Afterwards, once a minute:
Apr 7 18:35:37 OpenWrt netifd: wan (1527): udhcpc: lease of 192.168.100.100 obtained, lease time 122
Apr 7 18:36:38 OpenWrt netifd: wan (1527): udhcpc: sending renew to 192.168.100.1

So here’s the weird thing. The router was given a lease of only 5 seconds. But it seems to be ignoring that and just blindly renewing the lease every minute, pretending it’s a 122 second lease. Worse it’s asking to renew the existing address and Dishy is gamely letting it. What we want is for the router to get a new address, not perpetually renew the old one.

I suspect all this buggy behavior is triggered by Dishy’s slightly odd short DHCP lease. But without digging in to OpenWRT’s code I don’t really know.

Update: the 122 seconds comes from Busybox/udhcpc’s code.

/* paranoia: must not be too small and not prone to overflows */
/* timeout > 60 - ensures at least one unicast renew attempt */
if (lease_seconds < 2 * 61)
	lease_seconds = 2 * 61;
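
The timing in the logs falls out of that clamp. Here is a small Python sketch of the arithmetic, assuming the client renews at half the (clamped) lease time, which is the usual DHCP default and matches the log timestamps (61 seconds between lines in the stuck state, 150 seconds in normal operation). The function names are made up.

```python
MIN_LEASE_SECONDS = 2 * 61  # the floor from the udhcpc snippet above


def effective_lease(offered_seconds: int) -> int:
    """Apply the same lower bound udhcpc applies to an offered lease."""
    return max(offered_seconds, MIN_LEASE_SECONDS)


def renew_delay(offered_seconds: int) -> int:
    """Seconds until the first renew, assuming renewal at half the lease."""
    return effective_lease(offered_seconds) // 2


print(effective_lease(5), renew_delay(5))      # 122 61
print(effective_lease(300), renew_delay(300))  # 300 150
```

So Dishy's 5 second lease becomes 122 seconds with a renew every 61 seconds, and the normal 300 second lease renews every 2.5 minutes.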

I first tried to fix this by sending SIGUSR2 to udhcpc; that signals it to release the address. The release went through but OpenWRT wasn’t smart enough to then decide to get a new address. (Reading the udhcpc docs, I should not be surprised this did not work). I finally just clicked the “reset” button next to the WAN port in the OpenWRT UI; that caused it to fully reset the link and get a new working address via DHCP. And all is well.

Bottom line: something needs to convince OpenWRT to get a new IP address after Dishy is fully configured. That should be automatic but it’s not working. Manually forcing a new IP address fixes it.
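
A rough Python sketch of how the stuck state could be spotted from the syslog lines above. It only parses log text; the regular expression is written against the exact udhcpc messages quoted in this post, and the function names are made up.

```python
import re

# Matches the udhcpc "lease obtained" lines shown in the syslog excerpts above.
LEASE_RE = re.compile(
    r"udhcpc: lease of (\d+\.\d+\.\d+\.\d+) obtained, lease time (\d+)"
)

TEMP_ADDRESS = "192.168.100.100"


def last_lease(log_lines):
    """Return (address, lease_seconds) from the most recent lease line, or None."""
    result = None
    for line in log_lines:
        m = LEASE_RE.search(line)
        if m:
            result = (m.group(1), int(m.group(2)))
    return result


def wan_is_stuck(log_lines) -> bool:
    """True if the most recent lease is still Dishy's temporary address."""
    lease = last_lease(log_lines)
    return lease is not None and lease[0] == TEMP_ADDRESS


sample = [
    "Apr 7 18:35:37 OpenWrt netifd: wan (1527): udhcpc: lease of 192.168.100.100 obtained, lease time 122",
    "Apr 7 18:36:38 OpenWrt netifd: wan (1527): udhcpc: sending renew to 192.168.100.1",
]
print(wan_is_stuck(sample))  # True
```

Something like this could be fed the router's log output and used as the trigger for resetting the WAN interface. I haven't wired that part up.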

Updates

Currently running busybox 1.30.1-6 (for udhcpc) and netifd 2021-01-09-753c351b-1.

Someone on Reddit with the same problem.

I posted to OpenWRT forums about the problem.

Telegraf / InfluxDB / Grafana

For years and years and years I’ve been using Munin to monitor servers. Sure it’s old and limited but it still worked reliably and simply without too much software to install. But I’ve finally modernized, mostly so I could collect more data off my Starlink satellite.

I went with the combination of Telegraf, InfluxDB, and Grafana. The so-called “TIG stack”. The end result is a system that works remarkably like Munin. Only instead of crappy Perl scripts there’s a clean fast Telegraf agent collecting data. The simple RRD databases are replaced with InfluxDB, which seems a bit complicated for the purpose but works well. And rrdtool’s old busted graphs are replaced by Grafana, a very fancy enterprise-level dashboard construction system.

I followed a couple of guides for installing stuff: this one and this one. The install breaks down to “install some packages from third party repos” followed by “configure Telegraf and Grafana”.

The configuration is.. not simple. First you have to set up the database with a user, etc. Use the default names! They are baked in to things you will be using later. Then you have to set up Telegraf to log what you need. You can do this incrementally; you don’t have to get it all right at once, but there’s a whole lot of options and things to learn about. I think it took me almost two hours to tweak to what I wanted.

Grafana’s the real monster though. One nice thing about Munin is if you started logging a new number with a plugin, the rest of the system would automatically figure out it should graph that number and show you. Not so with Grafana; instead you have to go through a very powerful but complicated web configuration to develop your systems. Fortunately Grafana is pretty good at auto-detecting variables and making some suggestions. The Explore query system is also good at figuring out what you might have in the database.

But the real win with Grafana is you can load other people’s dashboard designs from a community sharing site. For instance, I’m graphing most of my Telegraf system data using this dashboard. Notice how they helpfully supply a Telegraf config too; you can just copy and paste that stuff in and it will probably work.

Anyway, now I finally have some Starlink monitoring using sparky8512’s tools. It doesn’t use Telegraf but has its own little agent to log to InfluxDB directly, plus a sample Grafana dashboard. The dashboard is pretty limited, I’ll probably spend some time configuring my own eventually. But what’s there works for now.

Grafana Starlink screenshot

I was inspired by this Starlink enthusiast and his speedtest charts. Ironically the Starlink setup doesn’t give any speed tests, but I’ve added that using this telegraf plugin. Note if you use that you must install SpeedTest.net’s own client, not the one in the default Ubuntu repos.

All in all I’m pretty happy with the TIG stack. Despite (or maybe because of) all the complexity the new system consumes a lot less CPU than Munin did. And it’s way more flexible and modern. Sometimes it’s good to update to new things. It’s a shame it’s so complicated to configure though; it’s hard to imagine how Ubuntu could ship this as an easy turnkey install.

Starlink firmware revisions

Starlink doesn’t publish any sort of patch notes or update log. It’d be nice to have a community maintained list of firmware releases with whatever notes we can figure out.

Update: there’s now an editable spreadsheet of firmware revisions that should be updated in the future. The list below will not be updated.

See above; this data will go out of date.
4/1/2021   3912b4a3-9eaf-4f9e-81a3-18d8c837a26e.release
3/26/2021  b44f4294-6a78-4a57-b41c-5b613617086a.release
3/24/2021  bbd50ae9-da59-4f1d-b0e4-57c776b31ad1.release
3/22/2021  5f1ea9d9-7896-44da-821a-7a1ab07e78b9.release
3/21/2021  39d476b5-2102-44ec-bb7e-e526cc8f94f8.release
3/18/2021  d61f015c-556a-42b4-ac91-d8e41d157871.release
3/17/2021  19f05dfc-9d07-4989-b47f-87c8f87b0a25.release
3/09/2021  a8a9195a-8258-4dfc-8b5e-15f272cc2436.release
3/3/2021   848e54d2-015a-49cb-a814-34d7c5fc7e1a.release
2/19/2021  a95d0312-a6de-412e-9379-c6bee964f9e0.release Added mastNotNearVertical, slowEthernetSpeeds
2/15/2021  7db91a39-b61e-43fe-8bbe-ecb570197cae.release
2/9/2021   e68dfc80-fa1a-4fa4-9b21-d7ee2a918496.release
2/1/2021   3a1d5d0c-c93b-40b8-a884-021a9c1da1dc.release
1/24/2021  d9aff5fa-0334-49bc-b69d-b92a9c7871fa.release
1/9/2021   48b97eee-bda8-4593-b1d2-785ad4e493de.release (100 ms spikes gone)
12/19/2020 0a3e5881-1312-453f-9c97-8aa7fa2abb89.release
12/11/2020 e09928da-4e31-4040-a34e-61d38c10b37f.release
12/6/2020  5eb22757-5bc1-440f-ab64-9d5053986827.release
11/29/2020 37f8cd77-1e7c-47ea-b1a5-de99ebf16dee.release
11/23/2020 56d5e0aa-c1e8-4d25-aa65-dd8699aaaf62.release (SNR capped at 9)
11/20/2020 e2447448-74b0-42f7-b6c8-2bb5a608871a.release
11/10/2020 New version: ____758.release
11/?/2020  Old version: ____654.release
10/26/2020 ???

Sources:
https://www.reddit.com/r/Starlink/comments/lww8xx/starlink_firmware_update_on_332021/
https://www.reddit.com/r/Starlink/comments/m49cte/new_firmware_had_to_use_starlink_router_to/gqtbp7t/?utm_source=reddit&utm_medium=web2x&context=3
https://www.reddit.com/r/Starlink/comments/m8fpi3/starlink_version/grhaoa2/?utm_source=reddit&utm_medium=web2x&context=3

Starlink: router required?

Woke up this morning to my fancy new Starlink system not working. The culprit seems to have been the new firmware update. It applied but Dishy wasn’t working; lots of details in this Reddit post.

Annoyingly, the fix was to plug Dishy into Starlink’s router. It started working immediately. It’s still working, too, even though I plugged it back into my own router. (I haven’t tried rebooting Dishy yet.. I wonder if it will keep working?)

I sure hope Starlink doesn’t make their router mandatory. It’s really not suitable for anything other than the simplest home networks. It’s fixed at 192.168.1.*, has literally no configuration options other than the wifi password. I guess worst case I could put a second router in front of it and it’d be mostly harmless but it’ll add yet another layer of unnecessary NAT.

Update: a couple of Reddit commenters have said the real problem isn’t Dishy needing some magic config from the Starlink router, it’s that routers aren’t handling DHCP right. Read that linked post for a nice description of what Dishy is doing on startup. Before it’s got satellite lock it sends out the not-useful 192.168.100.100 IP address with a 5 second DHCP lease (very short!). Then it switches to a second, useful IP address (in the 100.64.0.0/10 CGNAT range) once it’s set up, with a 5 minute lease. The theory is some routers fail to update DHCP and switch to the new good address. I’m a little skeptical OpenWRT is doing something so dumb but since a couple of folks have offered it as an explanation I’d have to test it to be sure. Just have to wait for the next failure.

Ubuntu and Netplan: gotchas

I finally figured out a networking configuration problem on my Ubuntu 20.04 system.

In theory all Ubuntu systems since 2017 have been configured with Netplan. It’s some simplified high level configuration language that systemd applies. Like any Unix thing built after 1993 or so I reflexively hate it; I kinda miss the old days of /etc/network/interfaces and hacking stuff in /etc/rc.local. But I gotta admit Netplan is pretty good. When it works; the problem I’ve had is my config gets overridden.

The first problem is cloud-init, some thing that Ubuntu ships to configure “cloud servers” from yet other files. I believe default Ubuntu systems are configured via /etc/netplan/50-cloud-init.yaml but if you edit that file, joke’s on you; it gets rewritten by something else. The solution is to create a file /etc/cloud/cloud.cfg.d/99-custom-networking.cfg with the contents network: {config: disabled}. That will prevent cloud-init from rewriting your files.

Having done that you can either delete 50-cloud-init.yaml or replace it with your own stuff. Note that config files are additive; I ended up configuring my ethernet twice until removing the cloud-init file.

The bigger problem I ran into is that I had dhcpcd installed, the package dhcpcd5. I don’t quite know why, it’s probably something I did manually. But if that’s installed it will override the netplan files. You can have dhcp4: no right there in your netplan file and dhcpcd will still override your static configuration. I fixed this by uninstalling dhcpcd. I think the modern systemd world uses its own DHCP client, built into systemd-networkd, anyway.

Once that’s done netplan seems to operate as it’s supposed to, without mysterious other things overriding my configuration.

Speedify and multipath

Trying to make my Starlink setup reliable enough to use regularly, I think I hit on a good consumer solution: Speedify. It’s a VPN service whose client on your desktop / laptop computer is smart enough to bond two Internet links. It can use extra links to boost speed or reliability or both. It seems to work. I’m using this to paper over Starlink’s unreliability. I imagine it’s particularly useful to bond crappy hotel wifi with a cellular hotspot (assuming you have two wifi adapters).

Before I go on about Speedify I want to mention Multipath TCP. That’s a general technology for doing something similar. The Linux kernel has support and there’s various implementations but I can’t tell if any of them are really usable; a lot look like research projects. The emphasis on TCP is also a limitation, I’d like a redundancy solution for all IP not just TCP. Kind of a different problem though.

The main thing I know about Speedify is I was watching a Youtube video and had an open ssh session and unplugged my ethernet cable that connects me to Starlink. But the video never stopped and the ssh session stayed alive, presumably using the Wifi backup to my other ISP. That’s neat! Also speed tests show I’m getting the full speed of my Starlink, at least sometimes, so it’s not constrained to the lowest common denominator. So far I’ve lost 0.4% of pings to 8.8.8.8 over most of an hour; that’s better than I see using either ISP alone.

The Windows install experience is a dream. Installs smoothly, autodetects your links, sets it all up and you’re running on Speedify in seconds. There’s a free no-account-required option for initial testing, limited to 2GB. The UI is pretty slick too, with decent monitoring and configurability.

One drawback is that Speedify seems to require different actual network devices to bond. What I’d like to do is use a single ethernet adapter for both my ISPs and have Speedify bond two separate gateways. It doesn’t have the UI to do this. (Maybe there’s a way to set up a virtual ethernet device in Windows? Dunno.) I’m working around this by having plugged in a separate WiFi adapter that’s connected to one ISP, with the ethernet to the other. That works fine actually.

Another drawback is that Speedify is a VPN service terminated in a data center. In my case, Fremont CA. That adds latency. Also a bunch of video services like Netflix and Amazon now refuse to serve video to VPN datacenters because they’re so often used to work around country restrictions, etc. Speedify has a special “turn this off for video” feature, but of course video is exactly when you might most want to have Speedify. Finally some folks complain that Speedify makes their links slower than necessary. Speedify, for their part, says you should expect 200-300Mbps.

A final limitation is that Speedify doesn’t give a lot of control over what it’s doing. The user can choose the bonding mode: speed, streaming, or redundant mode. Speed and redundant seem to be two extremes: either maximize performance (but no redundancy) or maximize reliability. Streaming mode is a “smart” middle ground; it monitors each stream and decides when you need more redundancy. Mostly they’re talking about audio and video streams but the marketing copy mentions gaming, too, so presumably that’s also covered. There’s some more technical details here.

Here’s a couple of DSL reports speed tests. These aren’t the best I’ve seen; I’ve gotten 150/20 through Speedify, which is about the limit of my Starlink throughput. (Although I’d expected 50Mbps on the uplink, needs further testing.) I noticed on the download speed test it was only using the Starlink whereas on upload it saturated both my Starlink and my WISP. I think that makes sense. First, redundancy seems more important on the uplink than the downlink. Second, my upload was really limited during these tests and so the WiFi meaningfully could supplement it.

I wish they had a whole-network product! You can probably hack something up using their Linux support and a clever router; their product page indicates they’re sort of OK with you doing that but might charge you accordingly. That’s what got me interested more generally in multipath technologies for Linux; it’d be nice to have a router designed to do this. I think the easiest way forward with what Speedify has right now is to set up a Raspberry Pi with three ethernet ports; two for inputs from ISPs, one for output to a regular router. You might be throughput limited by USB speeds (for the extra ethernet ports) or encryption speed, but I’m not sure what the limit would be.

I’m really impressed with Speedify so far. I’ll probably sign up for a plan ($3 to $6 a month) for at least a year, to tide me over with Starlink. Gonna run out my free trial here first though.

Update: after most of a day of using it I bought a year of Speedify. $65 with the coupon DEAL10. It really does work as advertised. None of my sessions dropped; World of Warcraft, persistent TCP sockets, two hour idle ssh sessions all stayed open and working. Really solid.

I’m not positive the performance is great but I may be overanalyzing. I’m in Streaming mode and I should be getting at least my Starlink speed, if not better. In practice the speed tests tell me I get about 100/15. Peak is 190/22. Then again that’s just about the average Starlink performance now, so maybe it’s right. It seems pretty smart about using my slower WISP link a lot less but the backup has been there for me every time I needed it. Latency seems fairly variable too, seeing 20-80ms where I’d expect 20-50ms from Starlink. None of this is bad and compared to an unreliable link, it’s terrific. But it does make me think if Starlink gets to 100% reliability I may stop using Speedify.

Now I want a little 4 ethernet port Linux box to run as a Speedify bonding appliance in front of my router! That, or Speedify built right into a router. That should be doable in theory. For some reason Speedify has positioned itself as not a router; they’re real clear it’s a client product.

Update 2: OpenMPTCProuter looks like a plausible base for a roll-your-own alternative to Speedify.

Starlink monitoring and hacking notes

Some collected information and what I’m doing about monitoring Starlink / Dishy’s performance. Note most of this stuff works with just Dishy, not the Starlink-supplied router.

Some facts

  • Dishy listens at 192.168.100.1 on ports 22, 80, and 9200. ssh, http, and gRPC. It also uses port 9201 for a different flavor of gRPC HTTP (for Android?)
  • The mobile app works by fetching web pages and/or gRPC data from Dishy.
  • Starlink’s router does gRPC on port 9000 and is the preferred source for the mobile app. It implements the ping interface, the thing that tells you your Fortnite lag.
  • You can view the stats direct from Dishy in any browser at http://192.168.100.1/support/statistics
    Note you have to open the browser window to max width; it’s formatted for the screen size, not window size.
  • Another Dishy URL: http://192.168.100.1/support/debug
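
A quick way to check which of those ports answer from wherever you are on the LAN is a plain TCP connect test. This is a minimal Python sketch; the port list comes from the notes above and the function names are made up. Nothing runs until you call scan().

```python
import socket

# Ports listed above; the labels are the ones given in the notes.
DISHY_HOST = "192.168.100.1"
DISHY_PORTS = {22: "ssh", 80: "http", 9200: "gRPC", 9201: "gRPC over HTTP"}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def scan(host: str = DISHY_HOST, ports=DISHY_PORTS, timeout: float = 1.0) -> dict:
    """Map each port to True/False depending on whether it accepts connections."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

If everything comes back False, the machine probably has no route to 192.168.100.1.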

Docs

Software

  • pinginfoview: nice lightweight Windows tool for testing if the connection is down
  • Starlink gRPC tools: Python code for reading gRPC stats off Dishy
    This seems to give very complete data from Dishy, including the history of dropped packets
  • dishyworld: a working dashboard based on Prometheus and Grafana.
    Has ambitious goals of collecting data from many users.
  • starlink-cli: Go code for getting Starlink data. I think it’s the base for dishyworld?
    This seems to give less data than the Python library above, at least by default
  • Better than Nothing Web Interface: a PHP dashboard
  • Better than Nothing stats: a mockup of a planned Javascript dashboard, successor to above
  • Obstruction view. Takes the debug info from Dishy about obstructions and draws it. I know the mobile app can do this but I can’t figure out how to do it again!
My obstructions

Starlink gRPC tools

This Python library seems to give more useful output than the other command line tool I could find so I’m pursuing it.
The full list of commands:
status, obstruction_detail, alert_detail, usage, bulk_history
ping_drop, ping_run_length, ping_latency, ping_loaded_latency

Be sure to read the code / docs for detailed notes on what all the fields mean. There’s some surprisingly sophisticated analysis going on when it’s looking at data samples to, say, characterize the latency of the last 5 minutes.

Some useful examples:

$ python dish_grpc_text.py -v status
id:                    ut01000000-00000000-000353f7
hardware_version:      rev1_pre_production
software_version:      848e54d2-015a-49cb-a814-34d7c5fc7e1a.release
state:                 CONNECTED
uptime:                130100
snr:                   9.0
seconds_to_first_nonempty_slot: 0.0
pop_ping_drop_rate:    0.0625
downlink_throughput_bps: 9143.5830078125
uplink_throughput_bps: 4387.2421875
pop_ping_latency_ms:   29.5
Alerts bit field:      0
fraction_obstructed:   0.006759106181561947
currently_obstructed:  False
seconds_obstructed:    266.0

# Dropped pings over the last hour
$ python dish_grpc_text.py -v ping_drop
current counter:       130202
All samples:           43200
Valid samples:         43200
Parsed samples:        3600
Sample counter:        130202
Total ping drop:       48.06441651657224
Count of drop == 1:    26
Obstructed:            23
Obstructed ping drop:  19.7899159938097
Obstructed drop == 1:  19
Unscheduled:           0
Unscheduled ping drop: 0.0
Unscheduled drop == 1: 0

# Bandwidth used for last 5 minutes
$ python dish_grpc_text.py -v usage -s 300
current counter:       131151
All samples:           43200
Valid samples:         43200
Parsed samples:        300
Sample counter:        131151
Bytes downloaded:      1327843
Bytes uploaded:        813427

Some of the ping_drop statistics are confusing; this github issue has notes on what they mean. “Unscheduled drop == 1” seems to exactly correlate to the stats UI’s “No Satellites” display, and “Obstructed drop == 1” is the Obstructed count. It’s less clear to me how to reconstruct the beta downtime statistics.
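
The verbose output is just "name: value" lines, so it's easy to pull into a script. Here is a minimal Python sketch of a parser (my own, not part of the Starlink gRPC tools), plus one derived number. The derived number assumes "Total ping drop" is the sum of per-sample drop fractions, which is my reading and not something I've confirmed.

```python
def parse_verbose_output(text: str) -> dict:
    """Parse 'name: value' lines into a dict, converting numbers where possible."""
    fields = {}
    for line in text.splitlines():
        # Skip shell prompts, comments, and anything without a separator.
        if ":" not in line or line.startswith(("$", "#")):
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        try:
            fields[key.strip()] = float(value) if "." in value else int(value)
        except ValueError:
            fields[key.strip()] = value  # keep non-numeric values as text
    return fields


sample = """\
Parsed samples:        3600
Total ping drop:       48.06441651657224
Count of drop == 1:    26
"""
stats = parse_verbose_output(sample)
drop_fraction = stats["Total ping drop"] / stats["Parsed samples"]
print(f"average ping drop: {drop_fraction:.2%}")  # average ping drop: 1.34%
```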

Starlink for real, LAN notes

March 9 is the first day I use Starlink for real for a whole day from my desktop computer. I’m writing this blog post VIA SPACE.

I mounted Dishy on a J-mount that was already installed at my house. (An adventure in itself. Had to drill some holes in the mount’s pipe, then figure out how to lift Dishy up 15 feet on a ladder and place it in the mount. Then hold it in place, some 20 pounds, with one hand while trying to thread through the bolt with the other. I am not good at this kind of thing.)

After my ground experiments I was worried there’d be too many obstructions. Glad I tried it anyway; turns out to be fine. On the ground I think at worst it reported 1 hour of obstructions in 12 hours of operation and in general it was complaining things weren’t right. Up on the mount it’s reported 2-4 minutes of obstructions in 12 hours. That may still be unacceptable; if those 2-4 minutes are actual outages it will be quite disruptive. But it seems to be working well right now.

Optional Gateway setup

The network setup has been tricky. I got some good advice on Reddit about how to set up the Starlink as a secondary Internet gateway. I want most of my house to keep using my old ISP but have Starlink as an option on the LAN while I test it. Turns out to be relatively simple to set that up, the basic idea is to set up a router for the Starlink without DHCP and then manually choose to use that as your Internet gateway. But it does require a second router and some manual fiddling on the clients you want to use Starlink. Note the router Starlink sends you is not flexible enough to do this; you need something more hacker friendly. I’m using an old TP-Link Archer C7 running OpenWRT.

(Fun side fact; this old OpenWRT I have has a funky configuration on sshd. To talk to it I have to do this
ssh -oKexAlgorithms=+diffie-hellman-group1-sha1 root@192.168.3.233)

My regular LAN is 192.168.3.* with a router running on 3.1 serving DHCP in the range of 3.6-3.223. Dishy is plugged into the WAN port on a second router with DHCP and WiFi turned off. The router is configured with a static IP address of 192.168.3.233 and the LAN side of the router is plugged into the LAN ethernet switch with everything else. So other computers on my LAN work just like normal and are mostly unaware of the Starlink and its router. However if I want I can statically configure a machine to use 192.168.3.233 as its Internet gateway instead of the DHCP default of 192.168.3.1 and now it’ll be using Starlink. It’s a bit of a PITA; in most operating systems if you want to change any network parameters you have to go full static config. But that’s not too hard.

My current static config:
192.168.3.233: the Starlink router
192.168.3.230: my desktop PC
192.168.3.213: my phone

The first setup I had for Starlink was even simpler; I just used their router and gave it a different WiFi SSID. Didn’t plug it into the LAN at all. That meant any wireless device could choose to use Starlink in my house just by choosing the other SSID. But that doesn’t give my full LAN access. Handy hack, not a great long term solution. (I sure wish I could tell OpenWRT to run DHCP on the wireless network only, though.)

A hilarious mistake the first time I set up the second router; the Starlink router had DHCPv6 turned on in a hidden setting I didn’t notice. All the Windows machines in the house noticed the new option and switched to IPv6 preferentially. Amazingly that all worked and all my PCs were suddenly able to talk to the IPv6 net via Starlink! Google worked great, but I think sites that were IPv4 only weren’t working in this Frankenstein configuration. I don’t normally use IPv6 ever.

The public address the rest of the Internet sees when I’m on Starlink is 206.214.236.64 at the moment. That’s a tiny little block that belongs to SpaceX: 206.214.236.64/27, or 32 addresses. Presumably they have more blocks?
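
The block size is easy to double check with Python's ipaddress module:

```python
import ipaddress

block = ipaddress.ip_network("206.214.236.64/27")
print(block.num_addresses)                              # 32
print(block[0], block[-1])                              # 206.214.236.64 206.214.236.95
print(ipaddress.ip_address("206.214.236.64") in block)  # True
```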

I think I’m going to be living with this opt-in configuration for awhile. Right now I have to go statically configure every device I want to swap over; my desktop, my game console, my Roku… It’d be better if I had a smarter DHCP server that I could direct to give out a different gateway depending on who is asking. Pretty sure Ubiquiti’s DHCP server can’t do that, but I bet some Linux option is flexible enough.

Alternately I wonder if there’s a way to leverage VLANs to do this. Have one VLAN use one gateway, the other use the other. But I want my devices to all mostly be on the same LAN regardless, so that doesn’t make a lot of sense.

Static route for Dishy at 192.168.100.1

Dishy itself is set up kind of like a cable modem. It runs DHCP on its output port (the one plugged into the WAN) and is handing out an address like 100.65.6.117 with a big fat netmask of 255.192.0.0. Gateway is 100.127.255.0. Turns out that’s a reserved shared address space, a special IP block kind of like 10.* or 192.168.* but reserved specifically for the needs of carrier grade NAT. Yup, Starlink is CGNAT, in addition to your home NAT. (But see also IPv6).
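
A short Python check of that netmask, using the lease values above. 255.192.0.0 is a /10, and both my address and the gateway land inside the 100.64.0.0/10 shared address space:

```python
import ipaddress

# Values from the DHCP lease described above.
iface = ipaddress.ip_interface("100.65.6.117/255.192.0.0")
gateway = ipaddress.ip_address("100.127.255.0")
shared_space = ipaddress.ip_network("100.64.0.0/10")  # RFC 6598 CGNAT block

print(iface.network)                  # 100.64.0.0/10
print(iface.network == shared_space)  # True
print(gateway in iface.network)       # True
```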

Dishy also is listening on 192.168.100.1; that’s where all the diagnostic services live. (More on that later). That means it’s handy to set up a static route for that. Here’s how I set it up in OpenWRT on the Starlink router at 192.168.3.233 so that any device using that as its gateway could talk to Dishy.

Static route on OpenWRT

I tried adding a similar static route for 192.168.100.1 on my Ubiquiti main house router (with a gateway of 192.168.3.233). But while that route works locally on the Ubiquiti machine itself, nothing else in the house seemed to work right with it. Not sure what’s going on; packets for 192.168.*.* seem to be routed out my WAN interface and into my ISP! That’s not right and I have no idea why it’s happening.

On my Linux server I managed to connect directly to the Dishy diagnostics by doing this:

ip route add 192.168.3.0/24 dev eno1
ip route add 192.168.100.1/32 via 192.168.3.233

Weird I had to add the ethernet route for my subnet first! If I didn’t do that I got “Error: Nexthop has invalid gateway” because the routing table had no idea how to get to 192.168.3.233. This was the routing table my Linux box had before I touched it:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.3.1     0.0.0.0         UG    0      0        0 eno1
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0

If I read that right all packets out of the Linux box were sent to the router, even for other hosts on the same LAN. Ping timings confirm that. That’s unnecessary, right? Adding an interface route for the LAN seems the right thing. I didn’t make any of this permanent; doing so is going to require messing with Ubuntu’s hideous network config mess.
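
Here is a toy Python model of the lookup, to show why the extra interface route matters. It's only a longest-prefix match over the tables above, not how the kernel actually does it (the kernel also validates that a nexthop is reachable, which is where the "invalid gateway" error came from). The helper names are made up.

```python
import ipaddress


def route(prefix, gateway, dev):
    """Build a (network, gateway, device) tuple; gateway None means on-link."""
    return (ipaddress.ip_network(prefix), gateway, dev)


def lookup(routes, dest):
    """Pick the most specific route containing dest (longest-prefix match)."""
    ip = ipaddress.ip_address(dest)
    matches = [r for r in routes if ip in r[0]]
    if not matches:
        return None
    return max(matches, key=lambda r: r[0].prefixlen)


# The original table shown above.
before = [
    route("0.0.0.0/0", "192.168.3.1", "eno1"),
    route("172.17.0.0/16", None, "docker0"),
]
# After the two "ip route add" commands above.
after = before + [
    route("192.168.3.0/24", None, "eno1"),
    route("192.168.100.1/32", "192.168.3.233", "eno1"),
]

print(lookup(before, "192.168.3.233")[1])  # 192.168.3.1 (LAN traffic goes via the router)
print(lookup(after, "192.168.3.233")[1])   # None (delivered directly on eno1)
print(lookup(after, "192.168.100.1")[1])   # 192.168.3.233
```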

Performance

Overall speed is highly variable; sometimes I see 20 Mbps, sometimes I see 150 Mbps. I don’t have a good feel for what’s causing that yet, need to collect more data. Here’s a DSL reports speed test I ran just now. (Hilariously you have to lie and say you’re on a WISP; it refuses to believe something this low latency is satellite). Note the upload speed was 40Mbps for the first half of the test.

153 Mbps down, 15Mbps up

Latency is also very good, better than my ISP. I left a ping to 8.8.8.8 running for about 3 hours last evening and got these results:

N: 11482
Total: 406168.800000
Average: 35.374395
StdDev: 35.046247
Median: 30.800000
Min: 19.800000
Max: 2088.000000

However.. I sent 11758 ping packets and only got back 11482 responses. That’s 276 lost packets, or 2.3%, or about one a minute. That’s not great. Using my desktop interactively I’ve noticed a few small periods of the network not working, web page loads failing. The Starlink mobile app shows little outages in that period too. So that’s no good. I have yet to determine if TCP connections break in these outages or if they just get paused for a few seconds.
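
The loss arithmetic, as a tiny Python helper (the function name is made up):

```python
def loss_stats(sent: int, received: int):
    """Return (lost_count, loss_percent) for a ping run."""
    lost = sent - received
    return lost, 100.0 * lost / sent


lost, pct = loss_stats(sent=11758, received=11482)
print(lost, round(pct, 1))  # 276 2.3
```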

Update: see below about TCP connections. More testing reveals a packet loss of about 1-2% over long periods of time. Compare 1.2% for my WISP during the busy evening when it’s congested. The difference with Starlink is when there’s an outage it’s typically like 5 seconds, not just a single packet.

There were 21 pings that took longer than 100ms, and 5 that took longer than 1 second. Worst was 2088ms.

Histogram of ping times

So that’s some preliminary stats; I’m very impressed. We’ll see how it feels after a few days of regular use. I also have a whole ‘nother blog post coming with more detailed information gathered from Dishy’s own diagnostics. Apparently there’s a whole gRPC server there folks have reverse engineered, so you can get the same data the mobile app shows.

Speaking of which, here are some screenshots from the mobile app:

Bottom Line

(Update after most of a day of using Starlink as my ISP)

It’s impressive. It works well. It definitely has some outages right now that make it not suitable today for a 100% reliable Internet connection. I trust these will be fixed in the next few months as they launch more satellites. Web surfing, etc mostly worked fine; occasionally I’d get an error that a page couldn’t be loaded and then when I reloaded a few seconds later it worked fine. Continuous sessions did not work well; I got booted out of World of Warcraft like 4 times in an hour. Unacceptable. I also couldn’t keep an idle ssh session open, probably because of aggressive NAT connection expiration but maybe for other reasons. I haven’t tried Zoom yet but I’m guessing it’ll have to reconnect.

Update 2: continuous sessions work better than I thought. Idle ssh connections drop after 10 minutes, but put even modest traffic on them and they last hours. So do unadorned netcat client/server connections with traffic. WoW does disconnect when there are momentary Starlink outages but that may well be WoW’s doing, deciding if it hears nothing after a few seconds the game is down. Zoom worked fine through the outages; I’d get a few second hang but the call would resume immediately with no trouble.

I ran a naive ping monitor to 8.8.8.8, 1 ping a second. Over 5 hours I sent about 18,000 pings and 126 failed; 0.7% rate. That doesn’t sound bad except the failures correspond to outages where the whole network seemed down, it wasn’t just a hiccup.
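The difference between a raw loss rate and actual outages is easier to see if you group consecutive failed pings into runs. Here’s a hedged Python sketch of that idea; the `sample` data is made up, and `outages` is an illustrative helper, not the monitor I actually ran.

```python
from itertools import groupby

def outages(results):
    """Lengths of each run of consecutive failed pings.

    results: iterable of booleans, True meaning a reply came back.
    """
    return [sum(1 for _ in run) for ok, run in groupby(results) if not ok]

# Made-up sample at one ping per second: a 4 second outage and a single drop
sample = [True] * 5 + [False] * 4 + [True] * 3 + [False] + [True] * 2
runs = outages(sample)
print(runs)

# The overall rate from the real run: 126 failures out of about 18,000 pings
rate_pct = 100 * 126 / 18000
print(round(rate_pct, 1))
```

A 0.7% rate spread evenly as single drops would be barely noticeable; the same rate bunched into multi-second runs is what kills sessions.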

Bottom line; it works great for streaming media and loading web pages. Real time interactive stuff suffers a bit. Latency is excellent, it really is the 20-50ms advertised. But occasionally the link goes down for 4 seconds and sessions fail.

Why does the link go down? It’s the beta period; Starlink says this is expected behavior. Dishy’s stats say I’ve had 3 minutes downtime from obstructions, 45 seconds because of lack of satellites, and 3 minutes “beta downtime”. Looking closer I think all of that downtime will stop happening once there’s more satellite coverage. At least, I hope so. I’ve been following along on https://satellitemap.space/ and every outage I’ve experienced is correlated with a gap overhead.

I’m a little less clear why sessions go down. TCP is designed to handle packet loss, but this is acting like the TCP sockets are actively terminated when the link is down for just a few seconds. That seems completely unnecessary! I’d love to talk to a Starlink engineer to understand why. Update: after some more experimenting, basic TCP sessions don’t go down. Idle TCP sessions get cancelled after 10 minutes, a common NAT problem. And some applications like World of Warcraft seem to decide the link is dead when Starlink has a 5 second outage. But active ssh sessions and simple netcat servers persist through the short Starlink outages.
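A common general workaround for NAT idle timeouts is TCP keepalive: have the kernel send a small probe before the NAT’s timer expires. I haven’t tested this against Starlink; the sketch below just shows the standard socket options in Python. The 300 second idle value is an assumption chosen to sit under the 10 minute timeout, and the `TCP_KEEP*` knobs are only set where the platform exposes them.

```python
import socket

def enable_keepalive(sock, idle_s=300, interval_s=60, probes=3):
    """Turn on TCP keepalive so an idle connection still sees some traffic."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timing options exist on Linux; skip them where they are missing
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock

s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
s.close()
```

For ssh the equivalent is the client’s `ServerAliveInterval` option, which sends an application-level probe on the same kind of schedule.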

I’m super excited about this new service. I’ll keep using it daily for a while and see how it feels. Also curious now about bonding with another ISP link for more reliability. Speedify looks promising.

My house is just east of the Arbuckle ground station

Updates (March 18)

Since writing this post I’ve been using Starlink more and more and made some changes to make it easier.

My Linux server and my Windows desktop now route through the Speedify VPN. This bonds Starlink with my second ISP link to make a more reliable Internet connection.

I’ve set up isc-dhcp-server on my Ubuntu box to serve DHCP and have it handing out special Starlink gateways to hosts based on MAC address. No more manual configuration of each machine! (Fun fact: Roku devices are incapable of having their network adapters configured without DHCP.) There are more sophisticated ways to do this using the “class” configs, but this works for something simple.

group {
    option routers 192.168.3.233;
    host ps5 { hardware ethernet 78:c8:81:bb:65:15; }
    host roku-office { hardware ethernet cc:a1:2b:93:b6:1f; }
}
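If the host list grows, a block like this is easy to generate. Here’s a minimal Python sketch that emits the same `group` stanza from a dict of hostnames to MAC addresses. The `dhcpd_group` helper is hypothetical, a convenience of my own invention and not part of isc-dhcp-server.

```python
def dhcpd_group(gateway, hosts):
    """Render a dhcpd.conf group that gives the listed hosts a custom gateway.

    hosts: mapping of host name -> MAC address string.
    """
    lines = ["group {", f"    option routers {gateway};"]
    for name, mac in hosts.items():
        lines.append(f"    host {name} {{ hardware ethernet {mac}; }}")
    lines.append("}")
    return "\n".join(lines)

config = dhcpd_group(
    "192.168.3.233",
    {
        "ps5": "78:c8:81:bb:65:15",
        "roku-office": "cc:a1:2b:93:b6:1f",
    },
)
print(config)
```

Paste the output into dhcpd.conf and restart the server for it to take effect.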

My first plotter art

I bought myself an AxiDraw pen plotter. Inspired by the work of my friends Joshua Schachter and Michal Migurski, not to mention the wide world of #PlotterTwitter, I figured I’d take a turn at doing some pen art. It’s been a long time since I programmed any original generative art of my own and I’ve never really worked with pens and paper, so it seemed fun. Here’s my first work.

Yes, it’s dicks. A lot of people’s first plotter piece is small multiples of things with slight variance. But why draw lots of circles when you can draw lots of dicks? I confess I still have the sense of humor of a 12 year old. Also I like dicks. I had in mind Keith Haring or something when planning this although I’ve quickly found owning a fancy plotter and knowing some Python does not make you Keith Haring. His dicks are so charming!

Learned a lot doing this. Most of it is about the fussiness of pen on paper. Pens all have different rates of ink flow and papers have different absorbency. I was doing my first tests with Sharpie on cheap printer paper which bleeds all over the damn place; the moment the pen dwells for even half a second you get a big blob. This is a somewhat inexpensive Lamy Safari fountain pen, a classic beginner fountain pen. Writes like a dream; I may switch to it for ordinary writing too! Good for thin precise lines. I also bought a whole bunch of other kinds of markers and pens including some meant to simulate watercolors.

The AxiDraw is a neat device too. It’s basically a metal framework with two very high precision servos (2800 dpi!) to move the pen to any X, Y coordinates. Also a simple pen-up/down device to lift the pen off the paper or let it rest on its own weight. That’s it; no pen pressure, no paper movement. Just pen up/down and move to x, y. It comes with an Inkscape plugin that can draw any SVG file you give it, and from that you can draw pretty much any vector graphics. The SVG interpreter has some limits; no masking. There is sort of hacked-on support for text fonts and for fills, although really it’s about drawing lines.

This first piece I did in PyCairo, the classic drawing library. I don’t have much love for it. It’s very confusing about when you’re working in relative vs absolute coordinates and it has some vestigial confusing path building thing that makes drawing arcs more confusing than it has to be. The big strength of PyCairo is you can draw in high level code and then render to SVG, PDF, various raster formats. And it rasterizes beautifully. I don’t need all of that. My plan now is to re-write my dicks thing in a lower level SVG framework, maybe svgwrite. I want something like D3 for producing SVG drawings, just a little DOM manipulation help. An alternate approach is to do everything in Shapely, make an object scene graph. Shapely has SVG rendering support.
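For a sense of how little is needed for the small-multiples idea, here’s a rough Python sketch that writes SVG by hand with plain strings. To be clear about assumptions: this is not my PyCairo code and it doesn’t use svgwrite or Shapely; circles stand in for the real shapes, and the grid size, jitter, and `small_multiples_svg` name are all made up for illustration.

```python
import random

def small_multiples_svg(cols=4, rows=3, cell=50, jitter=6, seed=1):
    """Return an SVG string: a grid of outlined circles with slight variance."""
    rng = random.Random(seed)  # fixed seed so a plot can be reproduced
    width, height = cols * cell, rows * cell
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for row in range(rows):
        for col in range(cols):
            # Center of the grid cell, nudged by a small random offset
            cx = col * cell + cell / 2 + rng.uniform(-jitter, jitter)
            cy = row * cell + cell / 2 + rng.uniform(-jitter, jitter)
            radius = cell / 4 + rng.uniform(-jitter / 2, jitter / 2)
            # Stroke only, no fill: a pen plotter draws lines
            parts.append(
                f'<circle cx="{cx:.1f}" cy="{cy:.1f}" r="{radius:.1f}" '
                f'fill="none" stroke="black"/>'
            )
    parts.append("</svg>")
    return "\n".join(parts)

svg = small_multiples_svg()
print(svg.count("<circle"), "shapes")
```

Write the string to a `.svg` file and it should open in Inkscape like any other drawing.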

Beyond that I have a host of other arty projects in mind. Bringing in geographic data is an obvious thing; may be time to revisit my old watershed maps, rivers, even just a simple OSM map renderer could be fun. I’m hoping to get inspired by the messiness of the ink medium; I’ve seen some neat work from other artists that really leans into the aesthetic of slightly scratchy lines of slightly bleeding ink.

Starlink install notes

Thinking seriously about where I will install my Starlink. I have an existing antenna mount that was for an old fixed wireless system that’d be perfect. But it has some obstructions to the north. From what I’ve read on /r/starlink obstructions are definitely a problem. They translate into short outages where you have no Internet. It’s still OK for bulk downloads, they just resume after the 15s outage. But I’d find that absolutely intolerable myself, particularly for anything interactive. That’s been my experience so far with a temporary ground installation. The roof mount will be better but not a lot; the ground install is experiencing 2 minutes of obstruction outages frequently.

The obstruction cone is going to get smaller over time as they add more satellites. I expect it’s worse down where I am at 39N, which is about as far south as they advertise service right now.

Here’s the worst of the obstructions. These trees will be good and leafy in a few months. And they’re right to the north, the worst place to be. Note the stuff in the dark grey area isn’t a problem, that’s below Dishy’s field of view. But everything above is.

The existing antenna mount. The pipe facing up is between 1 1/2″ and 2″ in diameter (it’s no longer round).

And a fun photo of my old 1 Mbps fixed wireless link (that was on that mount) with Dishy.

The flat platform that the bottom half of that mount is welded to is 5″ x 7″. The bolt holes are inset a small amount, roughly half an inch? I was wondering if I could use Starlink’s Volcano mount to mount to it. Their plate is bigger, 6″ x 9″. But I think I can drill holes to match my 5×7 mount; the plate bulges out to meet the pipe but it looks like there’s enough room. Willing to take the $24 risk at least.

BTW SpaceX is not providing detailed engineering drawings of all this mounting stuff. Dumb! There are some fairly precise measurements of the pole in this reddit post. Here are some dimensions on the Volcano mount. Also a popular photo set, someone measuring the stuff with calipers and displaying all the measurements upside down.