Commit Graph

555 Commits

Author SHA1 Message Date
Simon Rozman
faf810e8cb Fix NdisQueryMdl() NULL-buffer check
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 12:00:21 +02:00
Simon Rozman
9514ef37b3 Save some valuable lessons learned on Windows internals
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:58 +02:00
Simon Rozman
da3caadf48 Accept IRP_MJ_WRITE when paused but silently drop the packets
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:58 +02:00
Simon Rozman
7004db43a7 Support surprise removal
WHLK 1903 CHAOS tests are reporting lots of issues regarding surprise
removal misbehavior:

WDTF_PNP: INFO  : Result: TestSurpriseRemove operation timed out waiting
   for IRP_MN_REMOVE_DEVICE..
WDTF_PNP: ERROR : Result: Failed to receive IRP_MN_REMOVE_DEVICE after
   receiving IRP_MN_SURPRISE_REMOVAL. Ensure that there are no open
   handles or references to the test device (in user mode or in kernel
   mode) preventing IRP_MN_REMOVE_DEVICE from being sent. You may need
   to terminate any processes or services that may have open user mode
   handles to this device.  ( 80004005 ).

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:58 +02:00
Simon Rozman
85b4a769cf Replace PnP notifications with IRP_MJ_PNP dispatch handler
By replacing the NDIS' IRP_MJ_PNP dispatch handler we get the first
chance to clear the NBL queue to make NDIS proceed to TunPause() on
device removal.

This method is simpler than PnP notifications and we are chasing
surprise removal issues in WHLK tests. If this works, I'll hopefully
come back and update this commit message.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:58 +02:00
Simon Rozman
6b0293b18b Distinguish NDIS_STATUS and NTSTATUS
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:58 +02:00
Simon Rozman
b07c268e0c Implement proper PnP notification re-registration on canceled removal
The Microsoft Documentation is clear:

"The PnP manager can still call the driver's notification callback
routine, but in such calls the file object in the NotificationStructure
is not valid."[1]

Therefore, we must not touch the notification->FileObject in
GUID_TARGET_DEVICE_REMOVE_CANCELLED.

"Because the driver closed the previous registration handle in response
to the query-remove notification, the driver must open a new handle. The
driver must:

1. Remove the old registration with IoUnregisterPlugPlayNotification.
2. Open a new handle to the device.
3. Reregister for notification on the new handle with
   IoRegisterPlugPlayNotification."

Therefore, let's do it. Unfortunately, in order to implement this, we
must save the driver object and device symbolic name.

[1](https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-a-guid-target-device-query-remove-event)
[2](https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-a-guid-target-device-remove-cancelled-event)

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:58 +02:00
Simon Rozman
838251780e Rename ActiveTransactionCount to ActiveNBLCount
As ActiveTransacrionCount is all about counting NBLs in flight (or just
about to push some more NBLs), rename it to a more suitable name.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:58 +02:00
Simon Rozman
e3dfeeb2be Reuse Device.RefCount value to prevent duplicate status indications
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:57 +02:00
Simon Rozman
54ee02ae34 Fix the cleanup order in TunDispatchWrite()
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:57 +02:00
Simon Rozman
dc7a6fca11 Cleanup ref-counting overflow asserts
Asserting on 63-bit overflow seems a bit excessive.

While 31-bit overflow is more likely to happen, we should introduce a
real check if we are concerned about it. Rather than using an ASSERT in
Debug configuration run by probably nobody else but me.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:57 +02:00
Simon Rozman
70fcfdc85d Add missing state transition lock
When we are transitioning to a state that suspends some operations, we
must get an exclusive transition lock to wait for all operations that
have a shared transition lock. And TunHaltEx() is no exception.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:57 +02:00
Simon Rozman
60e103cad3 Revise adapter state checking
TunCheckForPause() was inlined and adjusted: it had two calls with two
potential new ones, but each call would have a slightly different
optimal variant: some with ActiveTransactionCount increment, others
without; some with Device.RefCount check, others without... No two
particular use cases were completely identical:

1. TunSendNetBufferLists:
   - requires ActiveTransactionCount increment
   - requires Device.RefCount > 0 check
   - reports NDIS_STATUS in case of rejection
2. IRP_MJ_READ:
   - no ActiveTransactionCount increment
   - no Device.RefCount > 0 check, as IRP_MJ_READ implies it
   - reports NTSTATUS in case of rejection
3. IRP_MJ_WRITE:
   - requires ActiveTransactionCount increment
   - no Device.RefCount > 0 check, as IRP_MJ_WRITE implies it
   - reports NTSTATUS in case of rejection
4. IRP_MJ_CREATE:
   - no ActiveTransactionCount increment
   - no Device.RefCount > 0 check to allow initial client connection
   - reports NTSTATUS in case of rejection, with slightly different
     status codes than IRP_MJ_READ and IRP_MJ_WRITE.

TUN_FLAGS_ENABLED was renamed to TUN_FLAGS_RUNNING: enabled/disabled
adapter means initialized&running/paused&halted in Windows world.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:57 +02:00
Simon Rozman
8162297c15 Count active transactions in NBLs rather than IRPs in receive direction
This makes ActiveTransactionCount a sum of:
- 1: initialized in TunRestart()
- count of NBLs in send queue
- count of NBLs in receive queue
- TunSendNetBufferLists() unfinished
- IRP_MJ_WRITE unfinished

Mind that we do not need to explicitly check for TUN_FLAGS_PRESENT early
in IRP_MJ_WRITE as it is implicitly checked by TunCheckForPause() along
with other flags required to run the packet flow. The later is actually
more correct by being shared-locked by the transition lock.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:57 +02:00
Simon Rozman
47cde0f340 Cleanup adapter power monitoring
Wintun adapter is always paused before transition to a low-power state.
Even on NDIS 6.30 we do not specify the NDIS_MINIPORT_ATTRIBUTES_NO_-
PAUSE_ON_SUSPEND flag. Since OID_PNP_SET_POWER does not do anything else
that TunPause() already takes care of it is redundant. The
TUN_FLAGS_ENABLED and TUN_FLAGS_POWERED were also mostly identical.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:56 +02:00
Simon Rozman
bce44d30e4 Use more appropriate status when rejecting NBLs with no client connected
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:56 +02:00
Simon Rozman
2392d173e7 Restore adapter on PnP remove-cancel and reuse notification file object
In case target device removal was canceled, the adapter is now restored
to present state. This is a part of research why Wintun adapters are
misbehaving in some WHLK tests.

PnP notifications already provide FILE_OBJECT of the device we are
monitoring. We don't need to store it in adapter context.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:56 +02:00
Simon Rozman
5c8dafcb3f Simplify state machine
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:54:56 +02:00
Sergei Mileshin
a530bb1b84 Set deny-all DACL instead of removing symlink on halting
Deleting symbolic link on device removal only still makes it possible to
open it from the real path.

Setting the deny-all DACL instead is a more reliable way of preventing
clients reopening the device when it is being removed.

Signed-off-by: Sergei Mileshin <msvsysproger@gmail.com>
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-20 11:53:08 +02:00
Simon Rozman
2e744ebde5 Revise buffer size calculation to work across 32/64-bit boundary
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-10 16:59:03 +02:00
Jason A. Donenfeld
c7fbe74d0e Version bump
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-06-08 09:04:47 +00:00
Jason A. Donenfeld
12e9f53fe2 Do not take handle reference with verifier enabled
This is actually very much wrong. In fact, it's bound to create all
sorts of nasty issues. But without it, we can't use the reference
function to check the validity of a potentially invalid handle while the
verifier is enabled.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-06-08 09:04:32 +00:00
Simon Rozman
c848006144 Clean TunWaitForReferencesToDropToZero()
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-08 08:14:07 +02:00
Simon Rozman
5c2a9a991e Refactor IRP_MJ_CLOSE TransitionLock unlocking
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-08 08:14:07 +02:00
Jason A. Donenfeld
9ee4310026 Document toolchain requirement
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-06-07 13:53:44 +02:00
Simon Rozman
ced1bcdd52 Reorder source code
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 13:05:57 +02:00
Simon Rozman
bbbe72ce84 Remove documentation files from project file
...they are part of solution now.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 13:05:57 +02:00
Simon Rozman
b6de71ffcd Add wintun.sln for convenience
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 13:05:57 +02:00
Simon Rozman
2a5da5c58f Revise constants requiring network-byte order
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Simon Rozman
9a815147f6 Annotate service control constants
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Simon Rozman
3306c6ec6c Use per-protocol NBL lists on write
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Simon Rozman
9504191ba0 Unify interlocked reference counting
- Only positive values are valid.
- Add missing overflow assert checks
- Fix spacing

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Simon Rozman
6c405efc42 README: 256 packets per exchange buffer limitation is obsolete
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Simon Rozman
cc3c1fefcb Check number of packets in exchange buffer for overflow
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Simon Rozman
fd6300dc6f Reduce IRP reference counter to 32-bit
This unifies 32 and 64-bit platforms to both use only one PVOID in IRP
extension now.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Simon Rozman
14617f118f Revise TransitionLock usage
The exclusive TransitionLock hold is required with all state changes
that:
- Switch adapter TUN_STATE_RUNNING -> TUN_STATE_PAUSING
- Switch media status to MediaConnectStateDisconnected
- Switch adapter to low-power state NdisDeviceStateD1-n

Because, those transitions require NBL queue to clear and stay empty.

The parts of the code that fill the NBL queue get a shared lock
before state check and release it after they are done filling the queue.

This prevents intermediate state changes while filling, and the queue
clearing parts of the code to wait for the filling parts to prevent a
race.

Starting up adapter (TUN_STATE_RESTARTING -> TUN_STATE_RUNNING,
MediaConnectStateDisconnected -> MediaConnectStateConnected, or
NdisDeviceStateDx -> NdisDeviceStateD0) does not require emptying NBL
queue and it does not need TransitionLock held.

This commit also moves TunQueueClear() out of exclusive TransitionLock
hold, because it is not necessary.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Simon Rozman
486b1c374b Simplify NdisMSendNetBufferListsComplete() flags
At the time of the NdisMSendNetBufferListsComplete() call, we're always
at Dispatch IRQL, because of ctx->TransitionLock being held.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-07 12:28:07 +02:00
Jason A. Donenfeld
ea26894a99 Delete symlink before forcing handles closed 2019-06-07 12:28:07 +02:00
Jason A. Donenfeld
86e425a0dc Document TransitionLock semantics and RCU intent
Rumor has it, NT is one of the few kernels that actually has RCU
(alongside Linux and maybe some IBM things). So maybe if we ever learn
how to use NT's RCU functionality, we'll be able to replace this with
that.
2019-06-07 12:28:07 +02:00
Jason A. Donenfeld
888cad7f2a Delay exit from HaltEx
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-06-07 12:28:07 +02:00
Simon Rozman
c19da2a99e Revise TunCompleteRequest() and make it universal
TunCompleteRequest() no longer sets Information field in IRP and allows
to specify custom priority boost. This makes it suitable replacement for
all "set status; complete request; release remove lock"-tuples
throughout the code.

Functional changes in this patch:

- We no longer reset Information field to 0 for canceled IRPs. In other
  words: ReadFile() of a canceled IRP will get the number of bytes read
  before request was canceled in the lpNumberOfBytesRead, instead of
  always 0.

- After write is complete, we boost user thread priority by +2
  (IO_NETWORK_INCREMENT).

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-03 10:53:11 +00:00
Simon Rozman
b9d0b301b8 Revise IRP_MJ_WRITE error reporting
The NDIS reason why TunCheckForPause() reported the adapter unavailable,
might not make much sense to a client. In case adapter is paused or in
low power state, the STATUS_CANCELLED (ERROR_OPERATION_ABORTED) is
returned.

Exchange buffer size overflow - total TUN_EXCH_MAX_IP_PACKET_SIZE or
individual packet - rejects entire exchange buffer now.

Exchange buffers containing non-IPv4 or non-IPv6 packets are now
rejected as a whole.

Allocation errors while preparing NBLs from the exchange buffer are now
considered fatal.

Ensure write buffer has at least sizeof(TUN_PACKET) left, or reject
entire exchange buffer.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-03 10:53:11 +00:00
Simon Rozman
adefe271a1 Switch to pending writes
Commentary from Jason:

Problem statement:
  We call IoCompleteRequest(Irp) immediately after
  NdisMIndicateReceiveNetBufferLists, which frees Irp->MdlAddress.
  Since we've just given the same memory to
  NdisMIndicateReceiveNetBufferLists (in a different MDL), we wind up
  freeing the memory before NDIS finishes processing them.

Fix possibility 1:
  Move IoCompleteRequest(Irp) to TunReturnNetBufferLists. This requires
  reference counting how many NBLs are currently in flight that are
  using an IRP. When that drops to zero, we can call IoCompleteRequest
  (Irp).
Problem:
  This means we have to block future wireguard-go Writes until *all*
  NBLs have completed processing in the networking stack. Is that safe
  to do? Will that introduce latency? Can userspace processes sabotage
  it by refusing to read from a TCP socket buffer? We don't know enough
  about how NdisMIndicateReceiveNetBufferLists works to assess its
  characteristics here.

Fix possibility 2:
  Use NDIS_RECEIVE_FLAGS_RESOURCES, so that
  NdisMIndicateReceiveNetBufferLists makes a copy, and then we'll simply
  free everything immediately after. This is slow, and it could
  potentially lead to wireguard-go making the kernel allocate lots of
  memory in the case that NdisAllocateNetBufferAndNetBufferList doesn't
  ratelimit its creation in the same way Linux's skb_alloc does.
  However, it does make the lifetime of Irps shorter, which is easier to
  analyze, and it might lead to better latency, since we don't need to
  wait until userspace sends its next packets, so long as Ndis'
  ingestion queue doesn't become too large.

This commit switches from (2) to (1).

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-03 10:53:11 +00:00
Jason A. Donenfeld
b6cc08d574 Force handles closed if required
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-06-03 10:53:11 +00:00
Jason A. Donenfeld
44ddfbe357 Clear NBLs on PnP notification
Otherwise Pause&Halt aren't called.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-06-03 10:53:11 +00:00
Simon Rozman
156294bb6e Clear internal NBL queue on transition to MediaConnectStateDisconnected
When adapter is in disconnected state, NDIS does not send it any NBLs.
After transition to disconnected state it should return all pending NBLs
back to NDIS, otherwise a deadlock occurs on pause attempt later.

Likewise when the adapter is in low-power state.

Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-03 10:53:11 +00:00
Simon Rozman
17e9e17826 Upgrade to VS2019 and update CSQ locking for analysis
Signed-off-by: Simon Rozman <simon@rozman.si>
2019-06-03 12:47:05 +02:00
Jason A. Donenfeld
c0c1bae82e Turn on compiler speed options
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-06-03 12:47:05 +02:00
Jason A. Donenfeld
b13ecdf97a README: Fix padding calculation
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-05-15 12:13:25 +02:00
Jason A. Donenfeld
bcb398bd62 installer: better WoW64 language
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-04-26 15:10:31 +02:00