tcpGRO() was using an incorrect IPv4 more fragments bit mask.
tcpPacketsCanCoalesce() was not distinguishing tcp6 from tcp4, and TTL
values were not compared. TTL values should be equal at the IP layer,
otherwise the packets should not coalesce. This tracks with the kernel.
Reviewed-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
IP options were not being compared prior to coalescing. They are not
commonly used. Disqualification due to nonzero options is in line with
the kernel.
Reviewed-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Enable TCP SACK for the gVisor Stack used in tun/netstack. This can
improve throughput by an order of magnitude in the presence of packet
loss.
Reviewed-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
There's not really a use at the moment for making this configurable, and
once bind_windows.go behaves like bind_std.go, we'll be able to use
constants everywhere. So begin that simplification now.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Implement TCP offloading via TSO and GRO for the Linux tun.Device, which
is made possible by virtio extensions in the kernel's TUN driver.
Delete conn.LinuxSocketEndpoint in favor of a collapsed conn.StdNetBind.
conn.StdNetBind makes use of recvmmsg() and sendmmsg() on Linux. All
platforms now fall under conn.StdNetBind, except for Windows, which
remains in conn.WinRingBind, which still needs to be adjusted to handle
multiple packets.
Also refactor sticky sockets support to eventually be applicable on
platforms other than just Linux. However Linux remains the sole platform
that fully implements it for now.
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Accept packet vectors for reading and writing in the tun.Device and
conn.Bind interfaces, so that the internal plumbing between these
interfaces now passes a vector of packets. Vectors move untouched
between these interfaces, i.e. if 128 packets are received from
conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is
no internal buffering.
Currently, existing implementations are only adjusted to have vectors
of length one. Subsequent patches will improve that.
Also, as a related fixup, use the unix and windows packages rather than
the syscall package when possible.
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
This seems like a much better demonstration as it removes the need for
external components.
Signed-off-by: Søren L. Hansen <sorenisanerd@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
To build with go1.19, gvisor needs
99325baf ("Bump gVisor build tags to go1.19").
However gvisor.dev/gvisor/pkg/tcpip/buffer is no longer available,
so refactor to use gvisor.dev/gvisor/pkg/tcpip/link/channel directly.
Signed-off-by: Shengjing Zhu <i@zhsj.me>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Use unix.ByteSliceToString in (*NativeTun).nameSlice to convert the
TUNGETIFF ioctl result []byte to a string.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Alexander Neumann <alexander.neumann@redteam-pentesting.de>
[Jason: don't wrap deadline error.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
This commit fixes all callsites of netip.AddrFromSlice(), which has
changed its signature and now returns two values.
Signed-off-by: Alexander Neumann <alexander.neumann@redteam-pentesting.de>
[Jason: remove error handling from AddrFromSlice.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Provide a PacketConn interface for netstack's ICMP endpoint; netstack
currently only provides EchoRequest/EchoResponse ICMP support, so this
code exposes only an interface for doing ping.
Signed-off-by: Thomas Ptacek <thomas@sockpuppet.org>
[Jason: rework structure, match std go interfaces, add example code]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
There are more places where we'll need to add it later, when Go 1.18
comes out with support for it in the "net" package. Also, allowedips
still uses slices internally, which might be suboptimal.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Update gvisor to v0.0.0-20211020211948-f76a604701b6, which requires some
changes to tun.go:
WriteRawPacket: Add function with not implemented error.
CreateNetTUN: Replace stack.AddAddress with stack.AddProtocolAddress, and
fix IPv6 address in error message.
Signed-off-by: Mikael Magnusson <mikma@users.sourceforge.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
(*NativeTun).operateOnFd is only used on darwin and freebsd. Adjust the
build tags accordingly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
At these points, the socket file descriptor is not yet wrapped in an
*os.File, so it needs to be closed explicitly on error.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Otherwise recent WDK binaries fail on ARM64, where an exception handler
is used for trapping an illegal instruction when ARMv8.1 atomics are
being tested for functionality.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The reason this was failing before is that dloadsup.h's
DloadObtainSection was doing a linear search of sections to find which
header corresponds with the IMAGE_DELAYLOAD_DESCRIPTOR section, and we
were stupidly overwriting the VirtualSize field, so the linear search
wound up matching the .text section, which then it found to not be
marked writable and failed with FAST_FAIL_DLOAD_PROTECTION_FAILURE.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Probably a bad idea, but we don't currently support it, and those huge
windows.NewCallback trampolines make juicer targets anyway.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
On Linux we can run `ip link del wg0`, in which case the fd becomes
stale, and we should exit. Since this is an intentional action, don't
treat it as an error.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
In 097af6e ("tun: windows: protect reads from closing") we made sure no
functions are running when End() is called, to avoid a UaF. But we still
need to kick that event somehow, so that Read() is allowed to exit, in
order to release the lock. So this commit calls SetEvent, while moving
the closing boolean to be atomic so it can be modified without locks,
and then moves to a WaitGroup for the RCU-like pattern.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>