There is a possible deadlock in `device.Close()` when the device is
closed very soon after it is started. The problem is that two different
methods acquire the same locks in opposite order:
1. device.Close()
   - device.ipcMutex.Lock()
   - device.state.Lock()
2. device.changeState(deviceState)
   - device.state.Lock()
   - device.ipcMutex.Lock()
Reproducer:
func TestDevice_deadlock(t *testing.T) {
    d := randDevice(t)
    d.Close()
}
Problem:
$ go clean -testcache && go test -race -timeout 3s -run TestDevice_deadlock ./device | grep -A 10 sync.runtime_SemacquireMutex
sync.runtime_SemacquireMutex(0xc000117d20?, 0x94?, 0x0?)
/usr/local/opt/go/libexec/src/runtime/sema.go:77 +0x25
sync.(*Mutex).lockSlow(0xc000130518)
/usr/local/opt/go/libexec/src/sync/mutex.go:171 +0x213
sync.(*Mutex).Lock(0xc000130518)
/usr/local/opt/go/libexec/src/sync/mutex.go:90 +0x55
golang.zx2c4.com/wireguard/device.(*Device).Close(0xc000130500)
/Users/martin.basovnik/git/basovnik/wireguard-go/device/device.go:373 +0xb6
golang.zx2c4.com/wireguard/device.TestDevice_deadlock(0x0?)
/Users/martin.basovnik/git/basovnik/wireguard-go/device/device_test.go:480 +0x2c
testing.tRunner(0xc00014c000, 0x131d7b0)
--
sync.runtime_SemacquireMutex(0xc000130564?, 0x60?, 0xc000130548?)
/usr/local/opt/go/libexec/src/runtime/sema.go:77 +0x25
sync.(*Mutex).lockSlow(0xc000130750)
/usr/local/opt/go/libexec/src/sync/mutex.go:171 +0x213
sync.(*Mutex).Lock(0xc000130750)
/usr/local/opt/go/libexec/src/sync/mutex.go:90 +0x55
sync.(*RWMutex).Lock(0xc000130750)
/usr/local/opt/go/libexec/src/sync/rwmutex.go:147 +0x45
golang.zx2c4.com/wireguard/device.(*Device).upLocked(0xc000130500)
/Users/martin.basovnik/git/basovnik/wireguard-go/device/device.go:179 +0x72
golang.zx2c4.com/wireguard/device.(*Device).changeState(0xc000130500, 0x1)
Signed-off-by: Martin Basovnik <martin.basovnik@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Access to Peer.endpoint was previously synchronized by Peer.RWMutex.
This has now moved to Peer.endpoint.Mutex. Peer.SendBuffers() is now the
sole caller of Endpoint.ClearSrc(), which is signaled via a new bool,
Peer.endpoint.clearSrcOnTx. Previous callers of Endpoint.ClearSrc() now
set this bool instead, primarily via peer.markEndpointSrcForClearing().
Peer.SetEndpointFromPacket() clears Peer.endpoint.clearSrcOnTx when an
updated conn.Endpoint is stored. This maintains the same event order as
before, i.e. a conn.Endpoint received after peer.endpoint.clearSrcOnTx
is set, but before the next Peer.SendBuffers() call, results in the
latest conn.Endpoint source being used for the next packet transmission.
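For illustration, a minimal sketch of the new synchronization shape; the
field layout here is assumed from this message, not copied verbatim from
the change:

type Peer struct {
    // ...
    endpoint struct {
        sync.Mutex   // protects val and clearSrcOnTx
        val          conn.Endpoint
        clearSrcOnTx bool // tells SendBuffers to ClearSrc before sending
    }
}

func (peer *Peer) markEndpointSrcForClearing() {
    peer.endpoint.Lock()
    defer peer.endpoint.Unlock()
    if peer.endpoint.val == nil {
        return
    }
    peer.endpoint.clearSrcOnTx = true
}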
These changes result in throughput improvements for single flow,
parallel (-P n) flow, and bidirectional (--bidir) flow iperf3 TCP/UDP
tests as measured on both Linux and Windows. Latency under load
improves, especially in high-throughput Linux scenarios. These
improvements are likely realized on all platforms to some degree, as the
changes are not platform-specific.
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Queue{In,Out}boundElement locking can contribute to significant
overhead via sync.Mutex.lockSlow() in some environments. These types
are passed throughout the device package as elements in a slice, so
move the per-element Mutex to a container around the slice.
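As a sketch of the container shape (names assumed from this message and
abbreviated), locking then happens once per batch rather than once per
element:

type QueueInboundElementsContainer struct {
    sync.Mutex
    elems []*QueueInboundElement
}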
Reviewed-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
If an IPC operation is in flight while close starts, it is possible for
the two code paths to deadlock. Prevent this by taking the IPC lock at
the start of close and holding it for the duration.
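A minimal sketch of the ordering fix, assuming the lock names used
elsewhere in this log:

func (device *Device) Close() {
    device.ipcMutex.Lock() // taken first and held for all of Close
    defer device.ipcMutex.Unlock()
    device.state.Lock()
    defer device.state.Unlock()
    // ... teardown proceeds with both locks held ...
}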
Signed-off-by: James Tucker <jftucker@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Accept packet vectors for reading and writing in the tun.Device and
conn.Bind interfaces, so that the internal plumbing between these
interfaces now passes a vector of packets. Vectors move untouched
between these interfaces, i.e. if 128 packets are received from
conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is
no internal buffering.
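For illustration, the vectorized shape looks roughly like this
(signatures abbreviated; offset handling and batch sizing elided):

type Device interface {
    // Read one or more packets from the device. On success it
    // reports how many packets were read and records each packet's
    // length in sizes.
    Read(bufs [][]byte, sizes []int, offset int) (n int, err error)
    // Write one or more packets to the device.
    Write(bufs [][]byte, offset int) (int, error)
    // ...
}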
Currently, existing implementations are only adjusted to have vectors
of length one. Subsequent patches will improve that.
Also, as a related fixup, use the unix and windows packages rather than
the syscall package when possible.
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
For some reason, this was omitted for response messages.
Reported-by: z <dzm@unexpl0.red>
Fixes: 8c34c4c ("First set of code review patches")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Heavier network extensions might require the wireguard-go component to
use less RAM, so let users of this package reduce these as needed.
At some point we'll put this behind a configuration method of sorts, but
for now, just expose the consts as vars.
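A sketch of the change, with illustrative names and values rather than
the package's full list of tunables:

// Previously consts; now vars, so embedders can shrink them.
var (
    QueueOutboundSize          = 1024
    QueueInboundSize           = 1024
    QueueHandshakeSize         = 1024
    PreallocatedBuffersPerPool = 0
)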
Requested-by: Josh Bleecher Snyder <josh@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Instead of hard-coding exactly two sources from which
to receive packets (an IPv4 source and an IPv6 source),
allow the conn.Bind to specify a set of sources.
Beneficial consequences:
* If there's no IPv6 support on a system,
conn.Bind.Open can choose not to return a receive function for it,
which is simpler than tracking that state in the bind.
This simplification removes existing data races from both
conn.StdNetBind and bindtest.ChannelBind.
* If there are more than two sources on a system,
the conn.Bind no longer needs to add a separate muxing layer.
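For illustration, the resulting shape at the time of this change (a
single-packet signature; vectorization came later):

// ReceiveFunc receives a single inbound packet from the network.
type ReceiveFunc func(b []byte) (n int, ep Endpoint, err error)

type Bind interface {
    // Open puts the Bind into a listening state and returns one
    // ReceiveFunc per packet source, rather than exactly two.
    Open(port uint16) (fns []ReceiveFunc, actualPort uint16, err error)
    Close() error
    // ...
}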
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
There's no way for len(peers) to be 0 when a current peer has
isRunning == false.
This requires some struct reshuffling so that the uint64 pointer is
aligned.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Since RoutineHandshake calls peer.SendKeepalive(), it is potentially a
writer into the encryption queue, so we need to bump the wg count.
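A sketch of the accounting, assuming the waitgroup-refcounted queue
pattern used elsewhere in this log:

// Each RoutineHandshake may push keepalives into the encryption
// queue, so it holds a writer reference for its lifetime.
device.queue.encryption.wg.Add(1)
go device.RoutineHandshake()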
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
RoutineReadFromTUN can trigger a call to SendStagedPackets.
SendStagedPackets attempts to protect against sending
on the encryption queue by checking peer.isRunning and device.isClosed.
However, those checks are subject to TOCTOU bugs.
If that happens, we get this:
goroutine 1254 [running]:
golang.zx2c4.com/wireguard/device.(*Peer).SendStagedPackets(0xc000798300)
.../wireguard-go/device/send.go:321 +0x125
golang.zx2c4.com/wireguard/device.(*Device).RoutineReadFromTUN(0xc000014780)
.../wireguard-go/device/send.go:271 +0x21c
created by golang.zx2c4.com/wireguard/device.NewDevice
.../wireguard-go/device/device.go:315 +0x298
Fix this with a simple, big hammer: Keep the encryption queue
alive as long as it might be written to.
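A sketch of that hammer, assuming a refcounted queue: the channel only
closes once the last potential writer drops its reference.

type outboundQueue struct {
    c  chan *QueueOutboundElement
    wg sync.WaitGroup
}

func newOutboundQueue() *outboundQueue {
    q := &outboundQueue{
        c: make(chan *QueueOutboundElement, QueueOutboundSize),
    }
    q.wg.Add(1)
    go func() {
        q.wg.Wait() // every writer has called Done
        close(q.c)  // only now is closing safe
    }()
    return q
}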
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
It's never used and we won't have a use for it. Also, move to running
stringer via `go run`, for those without GOPATHs.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The test previously checked the offset within a substruct, not the
offset within the allocated struct, so this adds the two together.
This fixes an alignment crash on 32-bit machines.
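For illustration, a sketch of the corrected check; the helper and field
names here are assumed, not necessarily those in the test:

func checkAlignment(t *testing.T, name string, offset uintptr) {
    t.Helper()
    if offset%8 != 0 {
        t.Errorf("%s is not 8-byte aligned: offset %d", name, offset)
    }
}

func TestAlignment(t *testing.T) {
    var device Device
    // The offset relative to the allocated struct is the substruct's
    // offset plus the field's offset within that substruct.
    checkAlignment(t, "Device.rate.underLoadUntil",
        unsafe.Offsetof(device.rate)+unsafe.Offsetof(device.rate.underLoadUntil))
}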
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Part of being actually idempotent is that we shouldn't penalize code
that takes advantage of this property with a log splat.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
We have a bunch of stupid channel tricks, and I'm about to add more.
Give them their own file. This commit is 100% code movement.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The TUN event reader does three things: change the MTU, bring the device
up, and take the device down.
Changing the MTU after the device is closed does no harm.
Device up and device down don't make sense after the device is closed,
but we can check for that condition before proceeding with changeState.
There's thus no reason to block device.Close on RoutineTUNEventReader
exiting.
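A sketch of the guard; updateMTU is a hypothetical helper standing in
for the real MTU-handling code:

for event := range device.tun.device.Events() {
    if event&tun.EventMTUUpdate != 0 {
        device.updateMTU() // harmless even after close
    }
    if event&tun.EventUp != 0 && !device.isClosed() {
        device.Up()
    }
    if event&tun.EventDown != 0 && !device.isClosed() {
        device.Down()
    }
}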
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This commit simplifies device state management.
It creates a single unified state variable and documents its semantics.
It also makes state changes more atomic.
As an example of the sort of bug that occurred due to non-atomic state changes,
the following sequence of events used to occur approximately every 2.5 million test runs:
* RoutineTUNEventReader received an EventDown event.
* It called device.Down, which called device.setUpDown.
* That set device.state.changing, but did not yet attempt to lock device.state.Mutex.
* Test completion called device.Close.
* device.Close locked device.state.Mutex.
* device.Close blocked on a call to device.state.stopping.Wait.
* device.setUpDown then attempted to lock device.state.Mutex and blocked.
Deadlock results. setUpDown cannot progress because device.state.Mutex is locked.
Until setUpDown returns, RoutineTUNEventReader cannot call device.state.stopping.Done.
Until device.state.stopping.Done gets called, device.state.stopping.Wait is blocked.
As long as device.state.stopping.Wait is blocked, device.state.Mutex cannot be unlocked.
This commit fixes that deadlock by holding device.state.Mutex
when checking that the device is not closed.
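A sketch of the unified state variable and its accessor, with values
assumed from this message rather than copied from the change:

// deviceState is the unified life-cycle state of a Device.
type deviceState uint32

const (
    deviceStateDown deviceState = iota
    deviceStateUp
    deviceStateClosed
)

// isClosed reports whether the device is permanently shut down.
// Code about to change state must recheck this while holding
// device.state.Mutex, which closes the TOCTOU window described above.
func (device *Device) isClosed() bool {
    // device.state.state is an atomic integer holding a deviceState.
    return deviceState(atomic.LoadUint32(&device.state.state)) == deviceStateClosed
}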
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
It is no longer necessary, as of 454de6f3e64abd2a7bf9201579cd92eea5280996
(device: use channel close to shut down and drain decryption channel).
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This makes the IpcGet method much faster.
We also refactor the traversal API to use a callback so that we don't
need to allocate at all. As part of avoiding allocations, we do
self-masking on insertion, which in turn means that split intermediate
nodes require a copy of the bits.
benchmark             old ns/op     new ns/op     delta
BenchmarkUAPIGet-16   3243          2659          -18.01%

benchmark             old allocs    new allocs    delta
BenchmarkUAPIGet-16   35            30            -14.29%

benchmark             old bytes     new bytes     delta
BenchmarkUAPIGet-16   1218          737           -39.49%
This benchmark is good, though it's only for a pair of peers, each with
only one allowed IP. As that number grows, the delta expands
considerably.
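For illustration, the callback-style traversal as invoked from the UAPI
code; the exact signature is assumed here, and sendf stands for the UAPI
line writer:

// Stream entries straight into the UAPI response without building
// an intermediate slice.
device.allowedips.EntriesForPeer(peer, func(ip net.IP, cidr uint8) bool {
    sendf("allowed_ip=%s/%d", ip.String(), cidr)
    return true // keep walking
})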
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
There are very few cases, if any, in which a user only wants one of
these levels, so combine them into a single level.
While we're at it, reduce indirection on the loggers by using an empty
function rather than a nil function pointer. It's not like we have
retpolines anyway, and we were already calling through a function with a
branch beforehand, so this seems like a net gain.
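For illustration, with names assumed from the resulting logger:

// An empty function instead of a nil pointer: call sites can always
// invoke the logger without a nil check.
func DiscardLogf(format string, args ...interface{}) {}

// Debug and info collapse into a single verbose level.
type Logger struct {
    Verbosef func(format string, args ...interface{})
    Errorf   func(format string, args ...interface{})
}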
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
This commit overhauls wireguard-go's logging.
The primary, motivating change is to use a function instead
of a *log.Logger as the basic unit of logging.
Using functions provides a lot more flexibility for
people to bring their own logging system.
It also introduces logging helper methods on Device.
These reduce line noise at the call site.
They also allow for log functions to be nil;
when nil, instead of generating a log line and throwing it away,
we don't bother generating it at all.
This spares allocation and pointless work.
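A sketch of the function-based unit of logging and a nil-tolerant
helper; the names are assumed, not verbatim:

// The basic unit of logging is a printf-style function.
type LogFn func(format string, args ...interface{})

type Logger struct {
    Debugf LogFn
    Errorf LogFn
}

// debugf is one of the helper methods on Device: when the log
// function is nil, the message is never even formatted.
func (device *Device) debugf(format string, args ...interface{}) {
    if device.log.Debugf != nil {
        device.log.Debugf(format, args...)
    }
}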
This is a breaking change, although the fix required
of clients is fairly straightforward.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This is similar to commit e1fa1cc556,
but for the decryption channel.
It is an alternative fix to f9f655567930a4cd78d40fa4ba0d58503335ae6a.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
It's possible for RoutineSequentialReceiver to try to lock an elem after
RoutineDecryption has exited. Before this change, that meant we never
unlocked the elem, so the whole program deadlocked.
As well, it looks like the flush code (which is now potentially
unnecessary?) wasn't properly dropping the buffers in the
not-already-dropped case.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>