Commit Graph

234 Commits

Author SHA1 Message Date
Jason A. Donenfeld
2dd424e2d8 device: handle peer post config on blank line
We missed a function exit point. This was exacerbated by e3134bf
("device: defer state machine transitions until configuration is
complete"), but the bug existed prior. Minus provided the following
useful reproducer script:

    #!/usr/bin/env bash

    set -eux

    make wireguard-go || exit 125

    ip netns del test-ns || true
    ip netns add test-ns
    ip link add test-kernel type wireguard
    wg set test-kernel listen-port 0 private-key <(echo "QMCfZcp1KU27kEkpcMCgASEjDnDZDYsfMLHPed7+538=") peer "eDPZJMdfnb8ZcA/VSUnLZvLB2k8HVH12ufCGa7Z7rHI=" allowed-ips 10.51.234.10/32
    ip link set test-kernel netns test-ns up
    ip -n test-ns addr add 10.51.234.1/24 dev test-kernel
    port=$(ip netns exec test-ns wg show test-kernel listen-port)

    ip link del test-go || true
    ./wireguard-go test-go
    wg set test-go private-key <(echo "WBM7qimR3vFk1QtWNfH+F4ggy/hmO+5hfIHKxxI4nF4=") peer "+nj9Dkqpl4phsHo2dQliGm5aEiWJJgBtYKbh7XjeNjg=" allowed-ips 0.0.0.0/0 endpoint 127.0.0.1:$port
    ip addr add 10.51.234.10/24 dev test-go
    ip link set test-go up

    ping -c2 -W1 10.51.234.1

Reported-by: minus <minus@mnus.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-29 12:31:54 -05:00
Josh Bleecher Snyder
387f7c461a device: reduce peer lock critical section in UAPI
The deferred RUnlock calls weren't executing until all peers
had been processed. Add an anonymous function so that each
peer may be unlocked as soon as it is completed.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-23 22:03:15 +01:00
Josh Bleecher Snyder
4d87c9e824 device: remove code using unsafe
There is no performance impact.

name                             old time/op  new time/op  delta
TrieIPv4Peers100Addresses1000-8  78.6ns ± 1%  79.4ns ± 3%    ~     (p=0.604 n=10+9)
TrieIPv4Peers10Addresses10-8     29.1ns ± 2%  28.8ns ± 1%  -1.12%  (p=0.014 n=10+9)
TrieIPv6Peers100Addresses1000-8  78.9ns ± 1%  78.6ns ± 1%    ~     (p=0.492 n=10+10)
TrieIPv6Peers10Addresses10-8     29.3ns ± 2%  28.6ns ± 2%  -2.16%  (p=0.000 n=10+10)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-23 22:03:15 +01:00
Jason A. Donenfeld
ef8d6804d7 global: use netip where possible now
There are more places where we'll need to add it later, when Go 1.18
comes out with support for it in the "net" package. Also, allowedips
still uses slices internally, which might be suboptimal.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-23 22:03:15 +01:00
Jason A. Donenfeld
de7c702ace device: only propagate roaming value before peer is referenced elsewhere
A peer.endpoint never becomes nil after being not-nil, so creation is
the only time we actually need to set this. This prevents a race from
when the variable is actually used elsewhere, and allows us to avoid an
expensive atomic.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-16 21:16:04 +01:00
Jason A. Donenfeld
fc4f975a4d device: align 64-bit atomic member in Device
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-16 21:07:31 +01:00
Jason A. Donenfeld
9d699ba730 device: start peers before running handshake test
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-16 21:07:31 +01:00
David Anderson
3cae233d69 device: fix nil pointer dereference in uapi read
Signed-off-by: David Anderson <danderson@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-16 20:43:26 +01:00
Jason A. Donenfeld
111e0566dc device: make new peers inherit broken mobile semantics
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-15 23:40:47 +01:00
Jason A. Donenfeld
e3134bf665 device: defer state machine transitions until configuration is complete
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-15 23:40:47 +01:00
Jason A. Donenfeld
63abb5537b device: do not consume handshake messages if not running
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-15 23:40:47 +01:00
Jason A. Donenfeld
eb6302c7eb device: timers: use pre-seeded per-thread unlocked fastrandn for jitter
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-10-28 13:47:50 +02:00
Jason A. Donenfeld
60683d7361 device: timers: seed unsafe rng before use for jitter
Forgetting to seed the unsafe rng, the jitter before followed a fixed
pattern, which didn't help when a fleet of computers all boot at once.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-10-28 13:34:21 +02:00
Jason A. Donenfeld
dfd688b6aa global: remove old-style build tags
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-10-12 12:02:10 -06:00
Jason A. Donenfeld
2ef39d4754 global: add new go 1.17 build comments
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-09-05 16:00:43 +02:00
Jason A. Donenfeld
f9b48a961c device: zero out allowedip node pointers when removing
This should make it a bit easier for the garbage collector.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-04 16:33:28 +02:00
Jason A. Donenfeld
d0cf96114f device: limit allowedip fuzzer a to 4 times through
Trying this for every peer winds up being very slow and precludes it
from acceptable runtime in the CI, so reduce this to 4.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-03 18:22:50 +02:00
Jason A. Donenfeld
841756e328 device: simplify allowedips lookup signature
The inliner should handle this for us.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-03 16:29:43 +02:00
Jason A. Donenfeld
c382222eab device: remove nodes by peer in O(1) instead of O(n)
Now that we have parent pointers hooked up, we can simply go right to
the node and remove it in place, rather than having to recursively walk
the entire trie.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-03 16:29:43 +02:00
Jason A. Donenfeld
b41f4cc768 device: remove recursion from insertion and connect parent pointers
This makes the insertion algorithm a bit more efficient, while also now
taking on the additional task of connecting up parent pointers. This
will be handy in the following commit.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-03 15:08:42 +02:00
Jason A. Donenfeld
4a57024b94 device: reduce size of trie struct
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-03 13:51:03 +02:00
Jason A. Donenfeld
c27ff9b9f6 device: allow reducing queue constants on iOS
Heavier network extensions might require the wireguard-go component to
use less ram, so let users of this reduce these as needed.

At some point we'll put this behind a configuration method of sorts, but
for now, just expose the consts as vars.

Requested-by: Josh Bleecher Snyder <josh@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-22 01:00:51 +02:00
Jason A. Donenfeld
99e8b4ba60 tun: linux: account for interface removal from outside
On Linux we can run `ip link del wg0`, in which case the fd becomes
stale, and we should exit. Since this is an intentional action, don't
treat it as an error.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-20 18:26:01 +02:00
Jason A. Donenfeld
9087e444e6 device: optimize Peer.String even more
This reduces the allocation, branches, and amount of base64 encoding.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-18 17:43:53 +02:00
Josh Bleecher Snyder
25ad08a591 device: optimize Peer.String
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-14 00:37:30 +02:00
Jason A. Donenfeld
7121927b87 device: add ID to repeated routines
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-07 12:21:21 +02:00
Jason A. Donenfeld
326aec10af device: remove unusual ... in messages
We dont use ... in any other present progressive messages except these.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-07 12:17:41 +02:00
Jason A. Donenfeld
efb8818550 device: avoid verbose log line during ordinary shutdown sequence
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-07 09:39:06 +02:00
Josh Bleecher Snyder
60a26371f4 device: log all errors received by RoutineReceiveIncoming
When debugging, it's useful to know why a receive func exited.

We were already logging that, but only in the "death spiral" case.
Move the logging up, to capture it always.
Reduce the verbosity, since it is not an error case any more.
Put the receive func name in the log line.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-06 11:22:13 +02:00
Jason A. Donenfeld
c7cd2c9eab device: don't defer unlocking from loop
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-12 16:19:35 -06:00
Jason A. Donenfeld
54dbe2471f conn: reconstruct v4 vs v6 receive function based on symtab
This is kind of gross but it's better than the alternatives.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-12 15:35:32 -06:00
Kristupas Antanavičius
d2fd0c0cc0 device: allocate new buffer in receive death spiral
Note: this bug is "hidden" by avoiding "death spiral" code path by
6228659 ("device: handle broader range of errors in RoutineReceiveIncoming").

If the code reached "death spiral" mechanism, there would be multiple
double frees happening. This results in a deadlock on iOS, because the
pools are fixed size and goroutine might stop until somebody makes
space in the pool.

This was almost 100% repro on the new ARM Macbooks:

- Build with 'ios' tag for Mac. This will enable bounded pools.
- Somehow call device.IpcSet at least couple of times (update config)
- device.BindUpdate() would be triggered
- RoutineReceiveIncoming would enter "death spiral".
- RoutineReceiveIncoming would stall on double free (pool is already
  full)
- The stuck routine would deadlock 'device.closeBindLocked()' function
  on line 'netc.stopping.Wait()'

Signed-off-by: Kristupas Antanavičius <kristupas.antanavicius@nordsec.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-12 11:14:53 -06:00
Josh Bleecher Snyder
10533c3e73 all: make conn.Bind.Open return a slice of receive functions
Instead of hard-coding exactly two sources from which
to receive packets (an IPv4 source and an IPv6 source),
allow the conn.Bind to specify a set of sources.

Beneficial consequences:

* If there's no IPv6 support on a system,
  conn.Bind.Open can choose not to return a receive function for it,
  which is simpler than tracking that state in the bind.
  This simplification removes existing data races from both
  conn.StdNetBind and bindtest.ChannelBind.
* If there are more than two sources on a system,
  the conn.Bind no longer needs to add a separate muxing layer.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-04-02 11:07:08 -06:00
Josh Bleecher Snyder
6228659a91 device: handle broader range of errors in RoutineReceiveIncoming
RoutineReceiveIncoming exits immediately on net.ErrClosed,
but not on other errors. However, for errors that are known
to be permanent, such as syscall.EAFNOSUPPORT,
we may as well exit immediately instead of retrying.

This considerably speeds up the package device tests right now,
because the Bind sometimes (incorrectly) returns syscall.EAFNOSUPPORT
instead of net.ErrClosed.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-30 12:41:43 -07:00
Josh Bleecher Snyder
02e419ed8a device: rename unsafeCloseBind to closeBindLocked
And document a bit.
This name is more idiomatic.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-30 12:07:12 -07:00
Jason A. Donenfeld
5f0c8b942d device: signal to close device in separate routine
Otherwise we wind up deadlocking.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-03-11 09:29:10 -07:00
Jason A. Donenfeld
593658d975 device: get rid of peers.empty boolean in timersActive
There's no way for len(peers)==0 when a current peer has
isRunning==false.

This requires some struct reshuffling so that the uint64 pointer is
aligned.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-03-06 08:44:38 -07:00
Jason A. Donenfeld
3c11c0308e conn: implement RIO for fast Windows UDP sockets
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-25 15:08:08 +01:00
Jason A. Donenfeld
f9dac7099e global: remove TODO name graffiti
Googlers have a habit of graffiting their name in TODO items that then
are never addressed, and other people won't go near those because
they're marked territory of another animal. I've been gradually cleaning
these up as I see them, but this commit just goes all the way and
removes the remaining stragglers.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-23 20:00:57 +01:00
Jason A. Donenfeld
9a29ae267c device: test up/down using virtual conn
This prevents port clashing bugs.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-23 20:00:57 +01:00
Jason A. Donenfeld
6603c05a4a device: cleanup unused test components
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-23 20:00:57 +01:00
Jason A. Donenfeld
a4f8e83d5d conn: make binds replacable
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-23 20:00:57 +01:00
Jason A. Donenfeld
c69481f1b3 device: disable waitpool tests
This code is stable, and the test is finicky, especially on high core
count systems, so just disable it.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-22 15:26:47 +01:00
Jason A. Donenfeld
8bf4204d2e global: stop using ioutil
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-17 22:19:27 +01:00
Jason A. Donenfeld
4e439ea10e conn: bump to 1.16 and get rid of NetErrClosed hack
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-16 21:05:25 +01:00
Jason A. Donenfeld
c7b7998619 device: remove old version file
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-12 17:59:50 +01:00
Jason A. Donenfeld
75e6d810ed device: use container/list instead of open coding it
This linked list implementation is awful, but maybe Go 2 will help
eventually, and at least we're not open coding the hlist any more.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-10 18:19:11 +01:00
Jason A. Donenfeld
747f5440bc device: retry Up() in up/down test
We're loosing our ownership of the port when bringing the device down,
which means another test process could reclaim it. Avoid this by
retrying for 4 seconds.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-10 01:01:37 +01:00
Jason A. Donenfeld
484a9fd324 device: flush peer queues before starting device
In case some old packets snuck in there before, this flushes before
starting afresh.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-10 00:39:28 +01:00
Jason A. Donenfeld
5bf8d73127 device: create peer queues at peer creation time
Rather than racing with Start(), since we're never destroying these
queues, we just set the variables at creation time.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-10 00:21:12 +01:00